This reference application demonstrates how to use NVIDIA NIM inference microservices to develop generative AI-powered chatbots using retrieval-augmented generation (RAG).
It showcases how to build canonical RAG-based chatbots using LangChain and LlamaIndex, and includes advanced applications such as Q&A chatbots over multimodal and structured datasets, as well as agentic RAG-based chatbots.
Leverage NVIDIA NIM, including Llama-3-8b-instruct and the NVIDIA NeMo™ Retriever embedding and reranking models, in end-to-end sample RAG chain servers built with LangChain, LlamaIndex, PandasAI, and a GPU-accelerated Milvus vector database. Deploy with either Helm charts or Docker Compose, optimized for LLM inference performance and scaling.
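As a rough illustration of the canonical chain described above, the sketch below wires a NIM-hosted Llama 3 8B Instruct model and a NeMo Retriever embedding model into a LangChain retrieval chain backed by Milvus. The package names (`langchain-nvidia-ai-endpoints`, `langchain-milvus`), model IDs, Milvus URI, and sample documents are assumptions for illustration only; the actual chain servers in the GitHub repo may be structured differently.

```python
# Minimal RAG chain sketch with NVIDIA NIM endpoints, LangChain, and Milvus.
# Assumes NVIDIA_API_KEY is set for hosted endpoints (or pass base_url for a
# local NIM) and that a Milvus instance is reachable at the URI below.
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_milvus import Milvus
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings

# NIM-hosted LLM and NeMo Retriever embedding model (model IDs are assumptions).
llm = ChatNVIDIA(model="meta/llama3-8b-instruct")
embeddings = NVIDIAEmbeddings(model="nvidia/nv-embedqa-e5-v5")

# Index a couple of toy documents into a Milvus collection.
docs = [
    Document(page_content="NVIDIA NIM packages models as inference microservices."),
    Document(page_content="NeMo Retriever provides embedding and reranking models."),
]
vectorstore = Milvus.from_documents(
    docs,
    embedding=embeddings,
    connection_args={"uri": "http://localhost:19530"},  # assumed Milvus endpoint
)
retriever = vectorstore.as_retriever()


def format_docs(retrieved):
    # Join retrieved chunks into a single context string for the prompt.
    return "\n\n".join(d.page_content for d in retrieved)


# Canonical RAG chain: retrieve context, stuff it into a prompt, call the LLM.
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("What does NVIDIA NIM provide?"))
```

The same retrieve-then-generate pattern underlies the LlamaIndex and agentic variants; only the orchestration framework and the data loaders change.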
Key benefits of the workflow include:
To get started with Helm charts, select any of the following applications and follow the instructions in its overview section.
To get started with the Docker Compose workflow, follow the instructions in the following NGC resource.
Documentation and source code for each of the above reference architectures can be found in the NVIDIA GenerativeAIExamples GitHub repository.
Learn more about how to use NVIDIA NIM microservices for RAG through our Deep Learning Institute. Access the course here.
The RAG applications are shared as reference architectures and are provided “as is.” Their security in production environments is the responsibility of the end users deploying them. When deploying in a production environment, have security experts review any potential risks and threats (including direct and indirect prompt injection); define the trust boundaries; secure the communication channels; integrate AuthN and AuthZ with appropriate access controls; keep the deployment, including the containers, up to date; and ensure the containers are secure and free of vulnerabilities.
By downloading or using the NVIDIA NIM inference microservices included in this workflow, you agree to the terms of the NVIDIA Software License Agreement and the Product-Specific Terms for AI Products.