NGC | Catalog
Many AI applications have common needs: classification, object detection, language translation, text-to-speech, recommender engines, sentiment analysis, and more. When developing applications with these capabilities, it is much faster to start with a model that is pre-trained and then tune it for a specific use case. The NGC catalog offers pre-trained models for a variety of common AI tasks that are optimized for NVIDIA Tensor Core GPUs, and can be easily re-trained by updating just a few layers, saving valuable time.
Logo for RIVA EnglishUS Hifigan
Riva multispeaker with IPA for G2P
Logo for RIVA EnglishUS Fastpitch
Riva multispeaker with IPA for G2P
Logo for DLRM checkpoint (TensorFlow2, TF32, BS64k, Base, FL15)
DLRM TensorFlow2 checkpoint trained on Criteo Dataset with FreqLimit=15 on A100 with TF32
Logo for BERT PaddlePaddle checkpoint (Large, Pretraining, AMP, LAMB)
BERT Large PaddlePaddle checkpoint pretrained with LAMB optimizer using AMP
Logo for BERT PaddlePaddle checkpoint (Large, QA, SQuAD1.1, AMP)
BERT-Large PaddlePaddle checkpoint finetuned for QA on SQuAD v1.1
Logo for Riva ASR Mandarin LM
Base Mandarin 4-gram LM
Logo for Riva ASR Arabic Inverse Normalization Grammar
Logo for Riva ASR Mandarin Inverse Normalization Grammar
Logo for Riva TTS English US Auxiliary Files
Contains files used in rmir creation
This collection contains two models. 1) Multi-speaker 44100Hz FastPitch trained on approximately 20 hours of Latin American Spanish speech from 174 speakers. 2) HiFiGAN trained on mel spectrograms produced by the Multi-speaker FastPitch in (1).
Logo for RIVA Punctuation and Capitalization for Spanish
For each word in the input text, the model: 1) predicts the punctuation mark, if any, that should follow the word (commas, periods, and question marks are supported) and 2) predicts whether the word should be capitalized.
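The two per-word predictions described above can be combined into readable text with a small post-processing step. The following is a minimal sketch of that step only, not the model or the Riva API: the function name `apply_predictions` and the label names (`O`, `COMMA`, `PERIOD`, `QUESTION`) are hypothetical and chosen for illustration.

```python
# Hypothetical label names; the catalog entry does not specify the model's label set.
PUNCTUATION = {"O": "", "COMMA": ",", "PERIOD": ".", "QUESTION": "?"}


def apply_predictions(words, punct_labels, capitalize_flags):
    """Rebuild text from per-word punctuation and capitalization predictions."""
    if not (len(words) == len(punct_labels) == len(capitalize_flags)):
        raise ValueError("expected one punctuation label and one flag per word")
    tokens = []
    for word, punct, capitalize in zip(words, punct_labels, capitalize_flags):
        # Upper-case only the first character; leave the rest of the word as-is.
        token = word[:1].upper() + word[1:] if capitalize else word
        # Append the predicted mark (empty string when no mark is predicted).
        tokens.append(token + PUNCTUATION[punct])
    return " ".join(tokens)


if __name__ == "__main__":
    text = apply_predictions(
        ["hola", "cómo", "estás"],
        ["COMMA", "O", "QUESTION"],
        [True, False, False],
    )
    print(text)
```

Keeping the two predictions as separate per-word sequences mirrors the description: each word receives one punctuation decision and one capitalization decision, so the output can be assembled in a single pass.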
Logo for Riva ASR Korean LM
Base Korean 4-gram LM
Logo for Riva ASR Brazilian Portuguese LM
Base Brazilian Portuguese 4-gram LM
Logo for RIVA Conformer ASR Korean
Korean (ko-KR) Conformer ASR model trained on ASR set 1.0
Logo for RIVA Citrinet-1024 ASR Korean
Korean (ko-KR) Citrinet-1024 ASR model trained on ASR set 1.0
Logo for RIVA Conformer ASR Brazilian Portuguese
Brazilian Portuguese (pt-BR) Conformer ASR model trained on ASR set 1.0
Logo for RIVA Citrinet-1024 ASR Brazilian Portuguese
Brazilian Portuguese (pt-BR) Citrinet-1024 ASR model trained on ASR set 1.0
Logo for STT It Conformer-Transducer Large
Conformer-Transducer-Large model for Italian Automatic Speech Recognition, trained on NeMo ASRSET 2.0
Logo for STT It Conformer CTC Large
Conformer-CTC-Large model for Italian Automatic Speech Recognition, trained on NeMo ASRSET 2.0
Logo for STT Ru Conformer-CTC Large
Conformer-CTC-Large model for Russian Automatic Speech Recognition, trained on Mozilla Common Voice 10.0 (Russian), Golos (Russian), Russian LibriSpeech (RuLS) and SOVA (RuAudiobooksDevices, RuDevices) datasets.
Logo for TTS En RAD-TTS Aligner
RAD-TTS Aligner model trained on female English speech.
Logo for STT Ru Conformer-Transducer Large
Conformer-Transducer-Large model for Russian Automatic Speech Recognition, trained on Mozilla Common Voice 10.0 (Russian), Golos (Russian), Russian LibriSpeech (RuLS) and SOVA (RuAudiobooksDevices, RuDevices) datasets.
Logo for Riva ASR French Inverse Normalization Grammar
Logo for TTS En FastPitch
FastPitch Speech Synthesis model trained on female English speech.
Logo for Bi3D Proximity Segmentation
Bi3D is a binary depth classification network that is used to classify the depth of objects at a given distance.
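To make the idea of binary depth classification concrete, the sketch below computes the kind of per-pixel output such a classifier targets, assuming the classification means "closer than the given distance" versus "not closer". It starts from already-known depth values purely for illustration; it is not the Bi3D network, and the function name `proximity_mask` is hypothetical.

```python
def proximity_mask(depth_map, plane_distance):
    """Return a binary mask: 1 where a pixel is closer than plane_distance, else 0.

    depth_map is a list of rows of depth values, in the same unit as plane_distance.
    """
    return [
        [1 if depth < plane_distance else 0 for depth in row]
        for row in depth_map
    ]


if __name__ == "__main__":
    depths = [
        [0.5, 2.0],
        [1.0, 3.5],
    ]
    # Mark every pixel that lies within 1.5 units of the camera.
    print(proximity_mask(depths, 1.5))
```

A proximity segmentation of this form answers a yes/no question per pixel for one chosen distance, which is a simpler target than estimating a full continuous depth value everywhere.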