NGC Catalog
Search thousands of GPU-optimized Containers, pretrained Models, SDKs, and Helm charts—ready to accelerate AI, digital twins, and HPC from cloud to edge.
Displaying 148 results
Sort: Most Popular
label: PyTorchLightning
NVIDIA
Diarization MSDD Telephonic
Multi-scale Diarization Decoder (MSDD) model for speaker diarization of telephone conversations
AI
Automatic Speech Recognition
+1
Conversational AI
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
TTS Vocoder Hifigan
HiFiGAN Speech Synthesis model
NeMo
Model
18mo
Updated
11/27/2024 UTC
NVIDIA
STT En Conformer-CTC Large
Conformer-CTC-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
TTS En FastPitch
FastPitch Speech Synthesis model trained on female English speech.
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
TTS Es Multispeaker FastPitch HiFiGAN
This collection contains two models: 1) a multi-speaker 44100 Hz FastPitch model trained on approximately 20 hours of Latin American Spanish speech from 174 speakers, and 2) a HiFiGAN model trained on mel spectrograms produced by the multi-speaker FastPitch model in (1).
AI
DL
+2
NeMo
Spanish
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
VAD Marblenet
MarbleNet VAD model
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
NMT En Es Transformer12x2
Neural Machine Translation (NMT) model to translate from English to Spanish
AI
DL
+2
NeMo
PyTorch
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
Bertlargeuncased
BERT Large Uncased trained on English Wikipedia and BookCorpus
Natural Language Processing
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
NMT En Zh Transformer24x6
Neural Machine Translation (NMT) model to translate from English to Simplified Chinese
AI
Conversational AI
+3
DL
Natural Language Processing
PyTorch
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT En Citrinet 1024
Citrinet 1024 model trained on the ASR Set dataset
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
TTS En Tacotron2
Tacotron2 Speech Synthesis model trained on female English speech
DL
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT Zh Quartznet15x5
QuartzNet is a Jasper-like network that uses separable convolutions and larger filter sizes. Its accuracy is comparable to Jasper's with far fewer parameters. This particular model has 15 blocks, each repeated 5 times.
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
SpeakerVerification Speakernet
SpeakerNet-M model trained with NeMo for speaker verification and speaker embeddings
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT Es Quartznet15x5
Speech To Text (STT) model based on QuartzNet for recognizing Spanish speech.
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
TTS En E2E FastPitch Hifigan
FastPitch+HiFiGAN End-to-End Speech Synthesis model trained on female English speech
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT Zh Citrinet 1024 Gamma 0.25
Citrinet-1024 model with kernel scaling factor (gamma) of 25%, which has been trained on the open-source Aishell-2 Mandarin Chinese corpus.
Automatic Speech Recognition
Conversational AI
+2
DL
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT En Conformer-CTC Medium
Conformer-CTC-Medium model for English Automatic Speech Recognition, trained on NeMo ASRSET
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT En Conformer-Transducer Large
Conformer-Transducer-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT Ru Quartznet15x5
Speech To Text (STT) model based on QuartzNet for recognizing Russian speech.
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT En Conformer-CTC Small
Conformer-CTC-Small model for English Automatic Speech Recognition, trained on NeMo ASRSET
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
Bertbaseuncased
BERT Base Uncased trained on English Wikipedia and BookCorpus
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
NMT En Es Transformer24x6
Neural Machine Translation (NMT) model to translate from English to Spanish
AI
Conversational AI
+2
DL
Natural Language Processing
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
STT Zh Citrinet 512
Citrinet 512 model trained on Aishell-2 Mandarin corpus
NeMo
Model
>3y
Updated
04/04/2023 UTC
NVIDIA
VAD telephony Marblenet
MarbleNet VAD model for telephony data
NeMo
Model
>3y
Updated
04/04/2023 UTC
1-24 of 148 items