NeMo Microservices

This collection contains NeMo microservices that provide a comprehensive suite of features to build an end-to-end platform for fine-tuning, evaluating, and serving large language models (LLMs) on your Kubernetes cluster.

Overview

NVIDIA NeMo microservices provide a modular platform for building and deploying AI workflows on Kubernetes, either on-premises or in the cloud. They let you use your proprietary data to continuously optimize AI applications through a data flywheel architecture.

Before You Start

Quick Start

Try out NeMo microservices locally with Docker Compose to experiment before deploying on Kubernetes.

Functional Microservices

Functional microservices for LLMs and embedding models:

Microservice | Description | Documentation
NVIDIA NeMo Data Designer | Generates high-quality synthetic datasets using AI models, statistical sampling, and configurable data schemas. | Design Synthetic Data From Scratch or Seeds
NVIDIA NeMo Safe Synthesizer (Early Access) | Creates private versions of sensitive tabular datasets using PII replacement, privacy-protecting synthesis with optional differential privacy, and comprehensive quality and privacy evaluation. | Generate Private Synthetic Data
NVIDIA NeMo Customizer | Fine-tunes LLMs and embedding models using supervised and parameter-efficient fine-tuning techniques. | Fine-Tune Models
NVIDIA NeMo Evaluator | Evaluates LLMs and embedding models with academic benchmarks, custom automated evaluations, and LLM-as-a-Judge approaches. | Evaluate Models
NVIDIA NeMo Guardrails | Adds safety checks and content moderation to LLM endpoints to protect against hallucinations, harmful content, and security vulnerabilities. | Manage Guardrails
NVIDIA NeMo Auditor (Early Access) | Audits models and agentic applications for security vulnerabilities and harmful content. | Audit Model Safety

Infrastructure Microservices

The following microservices provide the infrastructure for the functional microservices:

Microservice | Description | Documentation
NVIDIA NeMo Data Store | Default file storage for the NeMo microservices platform with APIs compatible with the Hugging Face Hub client (HfApi). | Manage Entities
NVIDIA NeMo Entity Store | Manages and organizes entities such as namespaces, projects, datasets, and models. | Manage Entities
NVIDIA NeMo Deployment Management | Deploys and manages NIM for LLMs on Kubernetes clusters through the NIM Operator microservice. | Run Inference with NIM
NVIDIA NeMo NIM Proxy | Unified endpoint to access all deployed NIM for LLMs for inference tasks. | Run Inference with NIM
NVIDIA NeMo Operator | Manages custom resource definitions (CRDs) for NVIDIA NeMo Customizer fine-tuning jobs. |
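
Because NeMo Data Store is described as compatible with the Hugging Face Hub client (HfApi), a client can in principle be pointed at it through the `endpoint` argument of `HfApi`. The sketch below only illustrates that idea. The base URL, the `v1/hf` path suffix, and the empty token are assumptions made for illustration, not values taken from this page; check the Data Store documentation for the real ones. The helper names `data_store_hf_endpoint` and `make_data_store_client` are hypothetical.

```python
def data_store_hf_endpoint(base_url: str, hf_path: str = "v1/hf") -> str:
    """Join a Data Store base URL and an HF-compatible API path.

    Both arguments are assumptions for this sketch; the real path
    depends on how your Data Store deployment exposes its API.
    """
    return base_url.rstrip("/") + "/" + hf_path.strip("/")


def make_data_store_client(base_url: str, token: str = ""):
    """Create an HfApi client that targets the Data Store instead of the Hub.

    Requires the third-party `huggingface_hub` package. It is imported
    lazily so the URL helper above works without it.
    """
    from huggingface_hub import HfApi

    return HfApi(endpoint=data_store_hf_endpoint(base_url), token=token)
```

With a reachable deployment, `make_data_store_client("http://data-store.example.local")` would return an `HfApi` object whose calls go to that endpoint; the host name here is a placeholder.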

Graphical User Interface Microservice

The NeMo microservices platform provides Studio, a web interface microservice for managing AI development workflows.

MicroserviceDescriptionDocumentation
NVIDIA NeMo StudioWeb-based UI for testing models in a chat playground, launching fine-tuning jobs, running evaluations, and organizing projects, models, and datasets.Manage AI Development Workflows with NVIDIA NeMo Studio

Helm Chart

Use the NeMo Microservices Helm chart
to install the complete platform or individual microservices with their
dependencies. This parent Helm chart simplifies deployment by bundling all NeMo
microservices into one chart.

You can install the platform in two ways:

  • Complete Platform Installation (default): Install all generally available NeMo microservices by setting tags.platform=true in the values file.
  • Tag-Based Installation: Install specific microservices by setting tags.platform=false and enabling individual tags, such as tags.customizer=true or tags.auditor=true.
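
The two installation modes can be sketched as a small Python helper that builds the tag overrides and renders them as Helm `--set` flags. This is a minimal illustration, not an official installer: the function names are hypothetical, the release and chart arguments are placeholders you must supply, and only the tag names mentioned above (platform, customizer, auditor) come from this page.

```python
def tag_values(enabled_tags=None):
    """Build a Helm values mapping for the two installation modes.

    No tags requested -> complete platform (tags.platform=true).
    Tags requested    -> tags.platform=false plus each requested tag set to true.
    """
    if not enabled_tags:
        return {"tags": {"platform": True}}
    if "platform" in enabled_tags:
        raise ValueError("omit enabled_tags to install the complete platform")
    tags = {"platform": False}
    for name in enabled_tags:
        tags[name] = True
    return {"tags": tags}


def to_set_flags(values):
    """Render the tags mapping as `--set tags.<name>=<bool>` arguments."""
    flags = []
    for name, enabled in values["tags"].items():
        flags += ["--set", f"tags.{name}={'true' if enabled else 'false'}"]
    return flags


def helm_install_command(release, chart, enabled_tags=None):
    """Assemble (but do not run) a `helm install` argument list.

    `release` and `chart` are caller-supplied placeholders.
    """
    return ["helm", "install", release, chart, *to_set_flags(tag_values(enabled_tags))]
```

For example, `helm_install_command("my-release", "<chart>", ["customizer"])` yields an argument list ending in `--set tags.platform=false --set tags.customizer=true`. Setting the same keys in a values file is equivalent.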

For detailed installation instructions, refer to Tag-Based Helm Installation.

Resources

Get Help

Enterprise Support

NVIDIA AI Enterprise Support offers access to various resources. For additional assistance, submit a support ticket.

Documentation

Visit the NVIDIA NeMo Microservices Documentation website for getting-started guides, tutorials, deployment guides, and more.

Governing Terms

The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.
