NVIDIA AI - End-to-End AI Development & Deployment

Description

This is a collection of performance-optimized frameworks, SDKs, and models to build Computer Vision and Speech AI applications.

Curator

NVIDIA

Modified

March 14, 2025

This collection provides performance-optimized frameworks, SDKs, and pre-trained models for AI practitioners to develop and deploy their solutions on any GPU-accelerated on-prem, cloud, and edge systems.

AI Tools Included in this Collection

NVIDIA releases a new version of much of its AI software every month, updated with optimized libraries that deliver higher training and inference performance on the same GPU-powered system.

Deep Learning Frameworks: Updated monthly, PyTorch and TensorFlow containers are optimized for GPU acceleration, and contain a validated set of libraries that enable and optimize GPU performance. These containers also contain software for accelerating ETL, training, and inference.
RAPIDS: Accelerates end-to-end data science and analytics pipelines entirely on GPUs.
TensorRT: Takes a trained network and produces a highly optimized runtime engine that performs inference for that network.
TAO: A Python-based AI toolkit for taking purpose-built pre-trained AI models and customizing them with your own data. Add all three TAO containers in the entities tab.
Triton: Open-source software for deploying trained AI models from any framework on any GPU- or CPU-based infrastructure in the cloud, data center, or embedded devices.
DeepStream: An SDK that delivers a complete streaming analytics toolkit for AI-based video and image understanding and multi-sensor processing.
RIVA: A GPU-accelerated SDK for building speech applications that are customized for your use case and deliver real-time performance. Includes RIVA Clients and RIVA Speech Skills.
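
Because many of these containers are refreshed monthly, it can help to compute the image reference for a given release month rather than hard-coding it. The sketch below is illustrative only: the registry host `nvcr.io`, the repository paths, and the `YY.MM-py3` tag convention are assumptions not stated on this page, and the helper names are hypothetical. Confirm the exact tag on each container's catalog page before pulling.

```python
from datetime import date

# Assumed registry host and repository paths; verify each on its NGC catalog page.
REGISTRY = "nvcr.io"
REPOSITORIES = {
    "pytorch": "nvidia/pytorch",
    "tensorflow": "nvidia/tensorflow",
    "tritonserver": "nvidia/tritonserver",
}


def monthly_tag(release: date, suffix: str = "py3") -> str:
    """Format an assumed YY.MM-style monthly tag, e.g. March 2025 -> '25.03-py3'."""
    return f"{release:%y.%m}-{suffix}"


def image_reference(name: str, release: date, suffix: str = "py3") -> str:
    """Build a full image reference for a known repository and release month."""
    repo = REPOSITORIES[name]  # raises KeyError for names not listed above
    return f"{REGISTRY}/{repo}:{monthly_tag(release, suffix)}"


if __name__ == "__main__":
    # The printed string can be passed to `docker pull` once the tag is confirmed.
    print(image_reference("pytorch", date(2025, 3, 1)))
```

Some containers use a different tag suffix, so the `suffix` parameter is left adjustable rather than fixed.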

Pre-trained Models Included in this Collection

We’ve also included a few pre-trained models for computer vision and speech to help you get started. See the Models page for the full list of models across vision, speech, healthcare, and more.

Computer Vision Models

PeopleNet: Three-class object detection network to detect people in an image. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/peoplenet
ActionRecognitionNet: Five-class network to recognize what people are doing in an image. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/actionrecognitionnet
TrafficCamNet: Four-class object detection network to detect cars in an image. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/trafficcamnet
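
The model links in this collection share a common path layout: `orgs/<org>/teams/<team>/models/<model>`. As a minimal sketch, assuming that layout holds for every link you handle, the hypothetical helper below splits a catalog URL into its org, team, and model name, which can be handy when scripting over a list of models. It uses only the Python standard library and is not part of any NVIDIA tooling.

```python
from urllib.parse import urlparse


def parse_model_url(url: str) -> dict:
    """Split a catalog model URL into org, team, and model name.

    Assumes the path layout orgs/<org>/teams/<team>/models/<model>,
    as seen in the links on this page.
    """
    parts = urlparse(url).path.strip("/").split("/")
    # Elements 0, 2, and 4 must be the fixed keywords of the layout.
    if len(parts) != 6 or parts[0::2] != ["orgs", "teams", "models"]:
        raise ValueError(f"unexpected catalog URL layout: {url}")
    return {"org": parts[1], "team": parts[3], "model": parts[5]}


if __name__ == "__main__":
    url = "https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/peoplenet"
    print(parse_model_url(url))
```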

Speech Models

English Citrinet ASR: Model trained on ASR set 3.0 to transcribe segments of audio to text. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechtotext_en_us_citrinet
Conformer-CTC: Transcribes speech in the lowercase English alphabet along with spaces and apostrophes. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_ctc_medium
FastPitch: Speech synthesis model trained on female English speech. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/tts_en_fastpitch
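
Since the Conformer-CTC model emits only lowercase letters, spaces, and apostrophes, reference transcripts usually need to be reduced to the same character set before they are compared with its output. The function below is a hypothetical pre-processing sketch under that assumption. It is not taken from the model's own tooling, and dropping digits and punctuation outright (rather than spelling them out) is a simplification.

```python
import re

# Matches any run of characters outside the model's stated output set:
# lowercase a-z, space, and apostrophe.
DISALLOWED = re.compile(r"[^a-z' ]+")


def normalize_transcript(text: str) -> str:
    """Reduce reference text to lowercase letters, spaces, and apostrophes."""
    lowered = text.lower().replace("\u2019", "'")  # map curly apostrophe to ASCII
    cleaned = DISALLOWED.sub(" ", lowered)         # replace everything else with a space
    return " ".join(cleaned.split())               # collapse repeated whitespace


if __name__ == "__main__":
    print(normalize_transcript("Hello, World! It's 9 AM."))
```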

License

By pulling and using the containers and models, you accept the terms and conditions of the applicable End User License Agreement.