Search thousands of GPU-optimized containers, pretrained models, SDKs, and Helm charts, ready to accelerate AI, digital twins, and HPC from cloud to edge.
Displaying 148 results
Multi-scale Diarization Decoder (MSDD) model for speaker diarization of telephone conversations
Model
HiFiGAN Speech Synthesis model
Model
Conformer-CTC-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
FastPitch Speech Synthesis model trained on female English speech.
Model
This collection contains two models. 1) Multi-speaker 44100Hz FastPitch trained on approximately 20 hours of Latin American Spanish speech from 174 speakers. 2) HiFiGAN trained on mel spectrograms produced by the Multi-speaker FastPitch in (1).
Model
MarbleNet VAD model
Model
Neural Machine Translation (NMT) model to translate from English to Spanish
Model
BERT Large Uncased trained on English Wikipedia and BookCorpus
Model
Neural Machine Translation (NMT) model to translate from English to Simplified Chinese
Model
Citrinet 1024 model trained on the ASR Set dataset
Model
Tacotron2 Speech Synthesis model trained on female English speech
Model
QuartzNet is a Jasper-like network that uses separable convolutions and larger filter sizes. Its accuracy is comparable to Jasper's with far fewer parameters. This particular model has 15 blocks, each repeated 5 times.
Model
SpeakerNet-M model trained with NeMo for speaker verification and speaker embeddings
Model
Speech To Text (STT) model based on QuartzNet for recognizing Spanish speech.
Model
FastPitch+HiFiGAN End-to-End Speech Synthesis model trained on female English speech
Model
Citrinet-1024 model with kernel scaling factor (gamma) of 25%, which has been trained on the open-source Aishell-2 Mandarin Chinese corpus.
Model
Conformer-CTC-Medium model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
Conformer-Transducer-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
Speech To Text (STT) model based on QuartzNet for recognizing Russian speech.
Model
Conformer-CTC-Small model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
BERT Base Uncased trained on English Wikipedia and BookCorpus
Model
Neural Machine Translation (NMT) model to translate from English to Spanish
Model
Citrinet 512 model trained on Aishell-2 Mandarin corpus
Model
MarbleNet VAD model for telephony data
Model