Search thousands of GPU-optimized Containers, pretrained Models, SDKs, and Helm charts, ready to accelerate AI, digital twins, and HPC from cloud to edge.
Displaying 101 results
Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, data center, or embedded devices.
Container
NVIDIA
TensorRT
NVIDIA TensorRT is a C++ library that facilitates high-performance inference on NVIDIA graphics processing units (GPUs). TensorRT takes a trained network and produces a highly optimized runtime engine that performs inference for that network.
Container
NVIDIA Developer Program
GenMol is a masked diffusion model trained on molecular SAFE representations for fragment-based molecule generation, which can serve as a generalist model for various drug discovery tasks.
Container
An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment
Container
NVIDIA
vLLM
vLLM is a fast and easy-to-use library for LLM inference and serving. The NVIDIA vLLM NGC Container is optimized for GPU acceleration, and contains a validated set of libraries that enable and optimize GPU performance.
Container
NVIDIA Developer Program
DiffDock predicts the 3D structure of the interaction between a molecule and a protein.
Container
The Dynamo vLLM runtime image is a containerized build of Dynamo + vLLM that serves as the base runtime environment for vLLM-based inference with Dynamo's distributed inference framework.
Container
The Dynamo TensorRT-LLM runtime image is a containerized build of Dynamo + TensorRT-LLM that serves as the base runtime environment for TensorRT-LLM-based inference with Dynamo's distributed inference framework.
Container
NVIDIA
CUDA GL
CUDA is a parallel computing platform and programming model that enables dramatic increases in computing performance by harnessing the power of NVIDIA GPUs. These images extend the CUDA images to include OpenGL support through libglvnd.
Container
Helm chart for NIM Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment
Helm Chart
A comprehensive Helm chart for deploying the NVIDIA Dynamo operator and its dependencies
Helm Chart
kubernetes-operator is a container that runs as part of the Dynamo cloud platform, a Kubernetes platform for deploying and managing inference services. This container manages the lifecycle of Dynamo inference deployments in Kubernetes.
Container
Scripts and utilities for getting started with Riva Speech Skills
Resource
The Dynamo SGLang runtime image is a containerized build of Dynamo + SGLang that serves as the base runtime environment for SGLang-based inference with Dynamo's distributed inference framework.
Container
NVIDIA AI Enterprise
Triton Inference Server Production Branch October 2024 (PB 24h2) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.
Container
NVIDIA AI Enterprise
Triton Inference Server PB May 2025 (PB 25h1) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.
Container
The Merlin PyTorch container lets users do preprocessing and feature engineering with NVTabular, train a deep-learning-based recommender system model with PyTorch, and serve the trained model on Triton Inference Server.
Container
NVCF ClusterAgent (NVCA) is a self-installable microservice that enables a compute backend, such as DGX Cloud or another NVIDIA compute backend, to be used as an NVCF target that hosts NVIDIA Cloud Functions instances.
Container
NVIDIA AI Enterprise
Triton Inference Server PB October 2025 (PB 25h2) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.
Container
NVCF ClusterAgent (NVCA) is a self-installable microservice that enables a compute backend to host NVIDIA Cloud Functions instances. NVCA uses the NVCA Webhook Server to enforce restrictions for NVCF Mini Service.
Container
A Helm chart that manages Custom Resource Definitions (CRDs) for the NVIDIA Dynamo ecosystem in Kubernetes
Helm Chart
NVIDIA
Morpheus
NVIDIA Morpheus is an open AI application framework for cybersecurity developers.
Container
NVIDIA AI Enterprise
Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, data center, or embedded devices.
Container
The Merlin TensorFlow container lets users do preprocessing and feature engineering with NVTabular, train a deep-learning-based recommender system model with TensorFlow, and serve the trained model on Triton Inference Server.
Container
