GPU-optimized AI, Machine Learning, & HPC Software

NVIDIA

Docker containers distributed as part of the TAO Toolkit package

Container

NVIDIA

PyTorch is a GPU-accelerated tensor computational framework. Functionality can be extended with common Python libraries such as NumPy and SciPy. Automatic differentiation is done with a tape-based system at the functional and neural network layer levels.

Container
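
As a rough illustration of the tape-based automatic differentiation mentioned in the PyTorch description above, the sketch below records each operation on a list (the "tape") and replays it in reverse to accumulate gradients. It is a toy in pure Python, not PyTorch's implementation; the names Tape, Var, and backprop are hypothetical and exist only for this example.

```python
# Illustrative only: a tiny reverse-mode "tape" in pure Python.
# Tape, Var, and backprop are hypothetical names, not PyTorch APIs.

class Tape:
    """Holds one backward closure per recorded operation, in execution order."""
    def __init__(self):
        self.steps = []


class Var:
    """A scalar value that records the operations applied to it."""
    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)

        def backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        self.tape.steps.append(backward)
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)

        def backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad

        self.tape.steps.append(backward)
        return out


def backprop(output):
    """Seed the output gradient with 1 and replay the tape in reverse."""
    output.grad = 1.0
    for step in reversed(output.tape.steps):
        step()


tape = Tape()
x = Var(3.0, tape)
y = Var(4.0, tape)
z = x * y + x          # z = x*y + x
backprop(z)
print(x.grad, y.grad)  # dz/dx = y + 1, dz/dy = x
```

Replaying the recorded steps in reverse order is what lets each operation see the gradient of its output before passing gradients on to its inputs.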

NVIDIA

TensorRT LLM Release

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.

Container

NVIDIA Developer Program

NVIDIA

NVIDIA NIM for GenMol

GenMol is a masked diffusion model trained on molecular SAFE representations for fragment-based molecule generation, which can serve as a generalist model for various drug discovery tasks.

Container

NVIDIA

NVIDIA NIM Operator

An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment

Container

NVIDIA

vLLM

vLLM is a fast and easy-to-use library for LLM inference and serving. The NVIDIA vLLM NGC Container is optimized for GPU acceleration, and contains a validated set of libraries that enable and optimize GPU performance.

Container

NVIDIA Developer Program

NVIDIA

Parakeet 0.6b CTC en-US NIM

Parakeet 0.6b CTC en-US NIM delivers accurate English speech-to-text transcription and enables easy-to-use, optimized ASR inference for large-scale deployments.

Container

NVIDIA

CUDA Deep Learning

CUDA is a parallel computing platform and programming model that enhances computing performance using NVIDIA GPUs. CUDA Deep Learning integrates networking and GPU-accelerated libraries such as cuDNN, cuTENSOR, NCCL, HPC-X, and the CUDA Toolkit.

Container

NVIDIA Developer Program

NVIDIA

NV-CLIP

NIM microservice for NV-CLIP, a multimodal embedding model for images and text.

Container

NVIDIA Developer Program

MIT

DiffDock

DiffDock predicts the 3D structure of the interaction between a molecule and a protein.

Container

NVIDIA Developer Program

NVIDIA

MolMIM

MolMIM is a transformer-based model developed by NVIDIA for controlled small molecule generation.

Container

NVIDIA

rag-server

The RAG server container is part of the NVIDIA AI Blueprint for RAG and orchestrates the end-to-end RAG pipeline.

Container

NVIDIA

Dynamo vLLM Runtime

The Dynamo vLLM runtime image is a containerized build of Dynamo + vLLM that serves as the base runtime environment for vLLM-based inference with Dynamo's distributed inference framework.

Container

NVIDIA

Dynamo TensorRT-LLM Runtime

The Dynamo TensorRT-LLM runtime image is a containerized build of Dynamo + TensorRT-LLM that serves as the base runtime environment for TensorRT-LLM-based inference with Dynamo's distributed inference framework.

Container

NVIDIA

JAX

JAX is a framework for high-performance numerical computing and machine learning research. It includes NumPy-like APIs, automatic differentiation, XLA acceleration, and simple primitives for scaling across GPUs, and it supports an ecosystem of libraries.

Container

NVIDIA

PyG

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

Container

NVIDIA

VSS Engine

Build a Video Search and Summarization Agent. Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A.

Container

NVIDIA

NVIDIA NIM Operator

Helm chart for NIM Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment

Helm Chart

NVIDIA

ingestor-server

The Ingestor server container is part of the NVIDIA RAG Blueprint and orchestrates end-to-end ingestion.

Container

NVIDIA

Dynamo Platform

A comprehensive Helm chart for deploying the NVIDIA Dynamo operator and its dependencies

Helm Chart

NVIDIA

BioNeMo Framework

A framework for running training and inference on large-scale bio-based models.

Container

NVIDIA

Dynamo kubernetes-operator

kubernetes-operator is a container that runs as part of the Dynamo cloud platform. Dynamo cloud is a Kubernetes platform for deploying and managing inference services. This container manages the lifecycle of Dynamo inference deployments in Kubernetes.

Container

NVIDIA

Riva Skills Quick Start

Scripts and utilities for getting started with Riva Speech Skills

Resource

NVIDIA

nemo-rl

NVIDIA NeMo™ RL accelerates reinforcement learning post-training with high-performance GPU backends, offering scalable GRPO, DPO, SFT, and distillation for multimodal models from single-node experiments to enterprise-scale clusters.

Container