This collection contains NeMo microservices that provide the features needed to build an end-to-end platform for fine-tuning, evaluating, and serving large language models (LLMs) on your Kubernetes cluster.
Overview
NVIDIA NeMo microservices provide a modular platform for building and deploying AI workflows across on-premises or cloud Kubernetes environments. They let you use your proprietary data to continuously optimize AI applications through a data flywheel architecture.
Before You Start
- Use, distribution, or deployment of these microservices in production requires an NVIDIA AI Enterprise License.
Quick Start
Try out NeMo microservices locally using Docker Compose for experimentation before deploying on Kubernetes:
- Data Designer Quickstart
- Safe Synthesizer Quickstart
- Evaluator Quickstart
- Guardrails Quickstart
- Auditor Quickstart
Functional Microservices
Functional microservices for LLMs and embedding models:
| Microservice | Description | Documentation |
|---|---|---|
| NVIDIA NeMo Data Designer | Generates high-quality synthetic datasets using AI models, statistical sampling, and configurable data schemas. | Design Synthetic Data From Scratch or Seeds |
| NVIDIA NeMo Safe Synthesizer (Early Access) | Creates private versions of sensitive tabular datasets using PII replacement, privacy-protecting synthesis with optional differential privacy, and comprehensive quality and privacy evaluation. | Generate Private Synthetic Data |
| NVIDIA NeMo Customizer | Fine-tunes LLMs and embedding models using supervised and parameter-efficient fine-tuning techniques. | Fine-Tune Models |
| NVIDIA NeMo Evaluator | Evaluates LLMs and embedding models with academic benchmarks, custom automated evaluations, and LLM-as-a-Judge approaches. | Evaluate Models |
| NVIDIA NeMo Guardrails | Adds safety checks and content moderation to LLM endpoints to protect against hallucinations, harmful content, and security vulnerabilities. | Manage Guardrails |
| NVIDIA NeMo Auditor (Early Access) | Audits models and agentic applications for security vulnerabilities and harmful content. | Audit Model Safety |
Infrastructure Microservices
The following microservices provide the infrastructure that the functional microservices depend on.
| Microservice | Description | Documentation |
|---|---|---|
| NVIDIA NeMo Data Store | Default file storage for the NeMo microservices platform with APIs compatible with the Hugging Face Hub client (HfApi). | Manage Entities |
| NVIDIA NeMo Entity Store | Manages and organizes entities such as namespaces, projects, datasets, and models. | Manage Entities |
| NVIDIA NeMo Deployment Management | Deploys and manages NIM for LLMs on Kubernetes clusters through the NIM Operator microservice. | Run Inference with NIM |
| NVIDIA NeMo NIM Proxy | Unified endpoint to access all deployed NIM for LLMs for inference tasks. | Run Inference with NIM |
| NVIDIA NeMo Operator | Manages custom resource definitions (CRDs) for NVIDIA NeMo Customizer fine-tuning jobs. | |
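The Data Store row above notes that its APIs are compatible with the Hugging Face Hub client (HfApi). As a rough sketch of what that could look like, the snippet below points an `HfApi` client at a Data Store address instead of the public Hub. The `/v1/hf` API path, the example host name, and the helper names `data_store_endpoint` and `make_data_store_client` are assumptions made for illustration, not values taken from this page; check the Data Store documentation for the actual endpoint.

```python
def data_store_endpoint(base_url: str, api_path: str = "/v1/hf") -> str:
    """Join a Data Store base URL with the Hub-compatible API path.

    The default api_path is an assumption for illustration only.
    """
    return base_url.rstrip("/") + "/" + api_path.strip("/")


def make_data_store_client(base_url: str, token: str = ""):
    """Create an HfApi client that targets the Data Store, not the public Hub."""
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    return HfApi(endpoint=data_store_endpoint(base_url), token=token)


# Hypothetical address; replace it with your own Data Store service URL.
print(data_store_endpoint("http://data-store.example.internal:8000/"))
```

With a client created this way, the usual `HfApi` methods (for example `create_repo` or `upload_file`) would be sent to the Data Store, provided the deployment exposes them at that path.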
Graphical User Interface Microservice
The NeMo microservices platform provides Studio, a web interface microservice for managing AI development workflows.
| Microservice | Description | Documentation |
|---|---|---|
| NVIDIA NeMo Studio | Web-based UI for testing models in a chat playground, launching fine-tuning jobs, running evaluations, and organizing projects, models, and datasets. | Manage AI Development Workflows with NVIDIA NeMo Studio |
Helm Chart
Use the NeMo Microservices Helm chart
to install the complete platform or individual microservices with their
dependencies. This parent Helm chart simplifies deployment by bundling all NeMo
microservices into one chart.
You can install the platform in two ways:
- Complete Platform Installation (default): Installs all generally available NeMo microservices by setting `tags.platform=true` in the values file.
- Tag-Based Installation: Installs specific microservices by setting `tags.platform=false` and enabling individual tags such as `tags.customizer=true` or `tags.auditor=true`.
For detailed installation instructions, refer to Tag-Based Helm Installation.
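To make the tag-based option concrete, here is a minimal Python sketch that builds the tag values for a chosen set of microservices and flattens them into `--set` arguments for Helm. Only the tag names `platform`, `customizer`, and `auditor` come from the text above; the helper names `build_tag_values` and `to_set_flags` are hypothetical, and the chart reference and release name are deliberately left out because this page does not state them.

```python
from typing import Dict, Iterable, List


def build_tag_values(microservices: Iterable[str]) -> Dict[str, Dict[str, bool]]:
    """Return Helm values that turn off the platform tag and enable selected tags."""
    tags: Dict[str, bool] = {"platform": False}
    for name in microservices:
        tags[name] = True
    return {"tags": tags}


def to_set_flags(values: Dict[str, Dict[str, bool]]) -> List[str]:
    """Flatten nested values into `--set key=value` arguments."""
    flags: List[str] = []
    for section, entries in values.items():
        for key, enabled in entries.items():
            # Helm expects lowercase booleans: true / false.
            flags += ["--set", f"{section}.{key}={str(enabled).lower()}"]
    return flags


values = build_tag_values(["customizer", "auditor"])
flags = to_set_flags(values)
print(" ".join(flags))
# prints: --set tags.platform=false --set tags.customizer=true --set tags.auditor=true
```

The same nested structure can be written to a values file instead of being passed as `--set` flags; which tags exist beyond the ones named above depends on the chart, so treat the list passed to `build_tag_values` as an example only.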
Resources
- NVIDIA NeMo Product Page
- Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices (blog)
- Customizing AI Agents for Tool Calling with NVIDIA NeMo Microservices (video)
Get Help
Enterprise Support
NVIDIA AI Enterprise Support offers access to various resources. For additional assistance, submit a support ticket.
Documentation
Visit the NVIDIA NeMo Microservices Documentation website for getting started guides, tutorials, deployment guides, and more.
Governing Terms
The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.