The NVIDIA K8s Developer LLM Operator is an open-source, easy-to-deploy Kubernetes Operator for self-hosting generative AI workflows.
In the initial release, the operator deploys a reference Retrieval-Augmented Generation (RAG) workflow: a chatbot that answers questions about public press releases and tech blogs. It handles document ingestion and serves a Q&A interface using open-source models deployed on any cloud or in a customer datacenter. The workflow leverages GPU-accelerated Milvus for efficient vector storage and retrieval, together with TensorRT-LLM (TRT-LLM) and a custom LangChain LLM wrapper, to achieve fast inference.
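As an illustration of the custom LLM-wrapper pattern mentioned above, the sketch below shows a minimal client that sends a prompt to an inference endpoint and returns the generated text. The endpoint URL, JSON schema, and class name are illustrative assumptions for this sketch, not the operator's actual API; the transport is injectable so the pattern can be exercised without a live TRT-LLM server.

```python
# Hedged sketch of the LLM-wrapper pattern: post a prompt to an inference
# endpoint and return the generated text. The URL and JSON fields below are
# assumptions for illustration, not the operator's real interface.
import json
from urllib import request


class TrtLlmClient:
    """Minimal wrapper: send a prompt, return the model's completion."""

    def __init__(self, endpoint: str, transport=request.urlopen):
        self.endpoint = endpoint
        # Injectable transport makes the wrapper testable offline.
        self.transport = transport

    def generate(self, prompt: str) -> str:
        req = request.Request(
            self.endpoint,
            data=json.dumps({"prompt": prompt}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with self.transport(req) as resp:
            return json.loads(resp.read())["text"]
```

In a RAG chain, a wrapper like this would sit behind the LangChain LLM interface, receiving prompts assembled from the retrieved Milvus context.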
For information on support and getting started, visit the official documentation on GitHub.
We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions using GitHub Issues and pull requests. We welcome all contributions!
In each of the READMEs within the GitHub repository, we indicate the level of support provided.
By pulling and using the container, you accept the terms and conditions of the NVIDIA AI Product License.