NGC Catalog
CLASSIC
Welcome Guest
Collections
Blueprint - Build an enterprise rag pipeline

Blueprint - Build an enterprise rag pipeline

For contents of this collection and more information, please view on a desktop device.
Logo for Blueprint - Build an enterprise rag pipeline
Description
The NVIDIA AI Blueprint for RAG gives developers a foundational starting point for building scalable, customizable retrieval pipelines that deliver both high accuracy and throughput with multi-modal data.
Curator
Modified
March 18, 2025
Containers
Sorry, your browser does not support inline SVG.
Helm Charts
Sorry, your browser does not support inline SVG.
Models
Sorry, your browser does not support inline SVG.
Resources
Sorry, your browser does not support inline SVG.

Overview

The NVIDIA AI Blueprint for RAG gives developers a foundational starting point for building scalable, customizable retrieval pipelines that deliver both high accuracy and throughput. Use this blueprint to create RAG applications that provide context-aware responses by connecting LLMs to extensive multimodal enterprise data—an essential capability for most generative AI use cases. This blueprint can be utilized as-is, combined with other NVIDIA Blueprints, such as the Digital Human Blueprint or the AI Virtual Assistant for customer service, or integrated with an agent to support more advanced use cases. Get started with this reference architecture to unlock actionable insights, ground your decisions in relevant data, and boost overall productivity.

RAG architecture diagram

Key features

  • Multimodal data extraction support with text, tables, charts, and infographics
  • Hybrid search with dense and sparse search
  • Opt-in image captioning with vision language models (VLMs)
  • Reranking to further improve accuracy
  • GPU-accelerated Index creation and search
  • Multi-turn conversations
  • Multi-session support
  • Telemetry and observability
  • Opt-in for reflection to improve accuracy
  • Opt-in for guardrailing conversations
  • Sample user interface
  • OpenAI-compatible APIs
  • Decomposable and customizable

Software used in this blueprint

NVIDIA Technology

  • NeMo Retriever Llama 3.2 Embedding NIM
  • NeMo Retriever Llama 3.2 Reranking NIM
  • Llama-3.1-70B-Instruct-NIM
  • NeMo Retriever Page Elements NIM
  • NeMo Retriever Table Structure NIM
  • NeMo Retriever Graphic Elements NIM
  • PaddleOCR NIM

Optional

  • NeMo Retriever Parse NIM
  • Llama 3.1 NemoGuard 8B Content Safety NIM
  • Llama 3.1 NemoGuard 8B Topic Control NIM
  • Llama 3.2 11B Vision Instruct NIM
  • Mixtral 8x22B Instruct 0.1

3rd Party Software

  • LangChain
  • Milvus database (accelerated with NVIDIA cuVS

Source Code

Documentation and source code regarding how to get started can be found here

Additional Resources

Learn more about how to use NVIDIA NIM microservices for RAG through our Deep Learning Institute. Access the course here.

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.

License

Use of the models in this blueprint is governed by the NVIDIA AI Foundation Models Community License

Terms of use

The software and materials are governed by the NVIDIA Software License Agreement (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and the Product-Specific Terms for NVIDIA AI Products (found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/), except that models are governed by the AI Foundation Models Community License Agreement (found at NVIDIA Agreements | Enterprise Software | NVIDIA Community Model License) and the NVIDIA RAG dataset is governed by the NVIDIA Asset License Agreement (found at https://github.com/NVIDIA-AI-Blueprints/rag/blob/main/data/LICENSE.DATA). ADDITIONAL INFORMATION: for Meta/llama-3.1-70b-instruct model the Llama 3.1 Community License Agreement, for nvidia/llama-3.2-nv-embedqa-1b-v2model the Llama 3.2 Community License Agreement, and for for nvidia/llama-3.2-nv-embedqa-1b-v2 model the Llama 3.2 Community License Agreement. Built with Llama.