NGC | Catalog
NVIDIA K8s Developer LLM Operator

Description: The NVIDIA K8s Developer LLM Operator is an open-source, easy-to-deploy Kubernetes Operator for self-hosting generative AI workflows.
Publisher: NVIDIA
Latest Tag: v0.1.0
Modified: April 4, 2024
Compressed Size: 68.19 MB
Multinode Support: No
Multi-Arch Support: No
v0.1.0 (Latest) Security Scan Results

Linux / amd64

NVIDIA K8s Developer LLM Operator

In the initial release, the operator deploys a reference Retrieval-Augmented Generation (RAG) workflow for a chatbot that answers questions about public press releases and tech blogs. The workflow provides document ingestion and a Q&A interface, using open-source models deployed on any cloud or in a customer data center. It uses GPU-accelerated Milvus for vector storage and retrieval, and TRT-LLM with a custom LangChain LLM wrapper for fast inference.
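
The retrieve-then-generate flow described above can be illustrated with a minimal, self-contained Python sketch. This is not the operator's code: the bag-of-words `embed` function, the in-memory `ToyVectorStore`, and the `llm` callable are hypothetical stand-ins for a real embedding model, Milvus, and a TRT-LLM-backed model, chosen only so the example runs without external services.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase token counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory store (hypothetical stand-in for a vector database such as Milvus)."""

    def __init__(self):
        self.rows = []  # list of (vector, original text) pairs

    def ingest(self, docs):
        """Document ingestion: embed each document and keep it with its text."""
        for doc in docs:
            self.rows.append((embed(doc), doc))

    def search(self, query, k=2):
        """Return the k stored documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def build_prompt(question, passages):
    """Combine retrieved passages and the user question into a single prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, store, llm, k=2):
    """RAG step: retrieve relevant passages, then pass the prompt to the LLM callable."""
    passages = store.search(question, k)
    return llm(build_prompt(question, passages))
```

In this sketch, `llm` is any callable that maps a prompt string to a response string, so a real inference client could be substituted for a test stub without changing the retrieval logic.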

Product Documentation

For information on support and getting started, see the official documentation on GitHub.

Contributions

We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions through GitHub Issues and pull requests. We welcome all contributions!

Support and Getting Help

In each of the READMEs within the GitHub repository, we indicate the level of support provided.

License

By pulling and using the container, you accept the terms and conditions of the NVIDIA AI Product License.