GPU-optimized AI, Machine Learning, & HPC Software

NVIDIA

PyTorch is a GPU accelerated tensor computational framework. Functionality can be extended with common Python libraries such as NumPy and SciPy. Automatic differentiation is done with a tape-based system at the functional and neural network layer levels.

Container

NVIDIA

NeMo Framework Megatron Backend

NVIDIA NeMo™ framework Megatron backend supports pre-training, post-training, and reinforcement learning of LLMs and multi-modal generative AI models with state-of-the-art data processing, model training techniques, and flexible deployment options.

Container

NVIDIA

vLLM

vLLM is a fast and easy-to-use library for LLM inference and serving. The NVIDIA vLLM NGC Container is optimized for GPU acceleration, and contains a validated set of libraries that enable and optimize GPU performance.

Container

NVIDIA Developer Program

NVIDIA

Llama-3.3-nemotron-super-49b-v1.5

This container houses the Llama-3.3-Nemotron-Super-49B-v1.5, which is a significantly upgraded version of Llama-3.3-Nemotron-Super-49B-v1 and is a large language model (LLM) which is a derivative of Meta Llama-3.3-70B-Instruct

Container

NVIDIA Developer Program

Stockmark

Stockmark-2-100B-Instruct

NVIDIA NIM for GPU accelerated Stockmark-2-100B-Instruct inference through OpenAI compatible APIs

Container

NVIDIA

Foundational Flywheel Server

Turn workload evaluation and optimization requests into a fully automated, end-to-end workflow across many specialized micro-services with the Flywheel orchestrator control-plane service.

Container

NVIDIA

Morpheus

NVIDIA Morpheus is an open AI application framework for cybersecurity developers.

Container

NVIDIA AI Enterprise

NVIDIA

Llama-3.3-nemotron-super-49b-v1.5-PB-25h2

This container houses the **Llama-3.3-Nemotron-Super-49B-v1.5 PB 25h2**, which is a significantly upgraded version of Llama-3.3-Nemotron-Super-49B-v1 and is a large language model (LLM) which is a derivative of Meta Llama-3.3-70B-Instruct

Container

NVIDIA

Kaldi

Kaldi is an open-source software framework for speech processing.

Container

NVIDIA

SGLang

SGLang is a fast serving framework for large language models and vision language models. The NVIDIA SGLang NGC Container is optimized for GPU acceleration, and contains a validated set of libraries that enable and optimize GPU performance.

Container

NVIDIA AI Enterprise

NVIDIA

NVIDIA Morpheus PB May 2024 (PB 24h1)

NVIDIA Morpheus Production Branch May 2024 (PB 24h1) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.

Container

NVIDIA

Nemotron-Content-Safety-Reasoning-4B (Experimental)

Nemotron Content Safety Reasoning 4B is a Large Language Model (LLM) classifier designed to function as a dynamic and adaptable guardrail for content safety and dialogue moderation (topic-following).

Container

NVIDIA

Morpheus Triton Server Models

The Morpheus Triton Server Models Container builds upon the NVIDIA Triton Inference Server container by adding the Morpheus pre-trained models.

Container

NVIDIA AI Enterprise

NVIDIA

NVIDIA Morpheus PB May 2025 (PB 25h1)

NVIDIA Morpheus Production Branch May 2025 (PB 25h1) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.

Container

Xiaomi

Mimo-V2-Flash (Experimental)

This container houses the model MiMo-V2-Flash.

Container

NVIDIA AI Enterprise

NVIDIA

NVIDIA Morpheus PB October 2024 (PB 24h2)

NVIDIA Morpheus Production Branch October 2024 (PB 24h2) offers a 9-month lifecycle for API stability, with monthly patches for high and critical software vulnerabilities.

Container

OpenAI

GPT-OSS-120b-Turbo

The GPT-OSS-120b-Turbo NIM container packages OpenAI's GPT-OSS-120b large language model, a sparse Mixture of Experts (MoE) architecture with 120B total parameters and 5.1B active parameters, as an NVIDIA NIM microservice.

Container

NVIDIA

Nemotron-3-Ultra-550B-A55B

Nemotron-3-Ultra-550B-A55B NIM container packages NVIDIA's large language model featuring a hybrid Latent Mixture-of-Experts (LatentMoE) architecture with Multi-Token Prediction (MTP) layers.

Container

NVIDIA

NIM Agent Blueprint for Vulnerability Analysis

The Vulnerability Analysis for Container Security is a NIM Agent Blueprint that dramatically accelerates vulnerability detection and mitigation with generative AI and the Morpheus cybersecurity SDK.

Container

NVIDIA

Text Classification with BERT and NeMo

Text Classification with BERT and NeMo. This NeMo application trains text classification models using single-GPU or multi-GPU. We log performance metrics and visualize them with TensorBoard. We show how to do inference with NeMo, and we visualize BERT embeddings before and after fine-tuning.

Container

NVIDIA

DLI NLP Course - Base Environment with NeMo

Base environment used in the NVIDIA NeMo projects of the NVIDIA Deep Learning Institute (DLI) course, "Building Transformer-Based Natural Language Processing Applications". This container also includes a "Next Steps" project.

Container

NVIDIA

Fine-Tune and Optimize BERT

Jupyter Notebooks for BERT Pre-training, Fine-Tuning and Inference profiling and optimization via TensorFlow, AMP, XLA, DLProf, TF-TRT and Triton.

Container

NVIDIA

aiq-agent

NVIDIA AI-Q Intelligence Agent — an enterprise-grade backend agent built on the NVIDIA NeMo Agent Toolkit, providing quick cited answers and in-depth report-style research with modular multi-agent workflows.

Container

NVIDIA

NVIDIA Data Flywheel Blueprint

Deploy the NVIDIA Data Flywheel Foundational Blueprint on Kubernetes using Helm charts for scalable, production-ready environments.

Helm Chart