Many AI applications have common needs: classification, object detection, language translation, text-to-speech, recommender engines, sentiment analysis, and more. When developing applications with these capabilities, it is much faster to start with a pre-trained model and then tune it for a specific use case. The NGC catalog offers pre-trained models for a variety of common AI tasks. These models are optimized for NVIDIA Tensor Core GPUs and can be re-trained by updating just a few layers, which saves valuable time.
RIVA Conformer ASR Mandarin
Model
Mandarin (zh-CN) Conformer ASR model trained on ASR set 2.0
RIVA Conformer ASR Russian - ASR set 1.0
Model
Russian Conformer ASR model trained on ASR set 1.0
TTS En Multispeaker FastPitch HiFiGAN
Model
This collection contains two models:
1) Multi-speaker FastPitch (around 50M parameters) trained on HiFiTTS, which contains over 291.6 hours of English speech from 10 speakers.
2) HiFiGAN trained on mel spectrograms produced by the Multi-speaker FastPitch in (1).
TTS En RAD-TTS Aligner
Model
RAD-TTS Aligner model trained on female English speech.
STT Hr Conformer-CTC Large
Model
Conformer-CTC-Large model for Croatian automatic speech recognition, trained on ParlaSpeech-HR v1.0 dataset.
STT Hr Conformer-Transducer Large
Model
Conformer-Transducer-Large model for Croatian automatic speech recognition, trained on ParlaSpeech-HR v1.0 dataset.
RIVA Conformer ASR English - ASR set 3.0
Model
English Conformer ASR model trained on ASR set 3.0
RIVA Conformer ASR French
Model
French (fr-FR) Conformer ASR model trained on ASR set 2.0
RIVA Conformer ASR English
Model
English (en-GB) Conformer ASR model trained on ASR set 1.0
STT En Es Multilingual Code-Switched Conformer Transducer Large
Model
English + Spanish Multilingual and Code-Switched Speech Recognition Conformer Transducer Large Model
STT Ca Conformer-Transducer Large
Model
Conformer-Transducer-Large model for Catalan automatic speech recognition, trained on MCV 9.0 dataset.
STT Ca Conformer-CTC Large
Model
Conformer-CTC-Large model for Catalan automatic speech recognition, trained on MCV 9.0 dataset.
STT Rw Conformer-CTC Large
Model
Conformer-CTC-Large model for Kinyarwanda Automatic Speech Recognition, trained on Mozilla Common Voice 9.0 Kinyarwanda dataset.
STT Rw Conformer-Transducer Large
Model
Conformer-Transducer-Large model for Kinyarwanda Automatic Speech Recognition, trained on Mozilla Common Voice 9.0 Kinyarwanda dataset.
RIVA Punctuation and Capitalization for French
Model
For each word in the input text, the model: 1) predicts a punctuation mark that should follow the word, if any (the model supports commas, periods, hyphens, and question marks), and 2) predicts whether the word should be capitalized.
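The per-word predictions described above can be turned back into readable text with a small post-processing step. The sketch below is illustrative only: `apply_predictions` and its label format (a punctuation string and a boolean per word) are hypothetical and are not the Riva API, and the rule that a hyphen attaches directly to the next word is an assumption.

```python
def apply_predictions(words, punct_labels, cap_labels):
    """Rebuild text from per-word punctuation and capitalization predictions.

    words        -- lowercase, unpunctuated tokens (for example ASR output)
    punct_labels -- for each word, the mark that follows it ("" for none)
    cap_labels   -- for each word, True if it should be capitalized
    All three arguments are hypothetical stand-ins for real model output.
    """
    pieces = []
    for word, punct, cap in zip(words, punct_labels, cap_labels):
        # Capitalize only the first character; leave the rest untouched.
        token = word[:1].upper() + word[1:] if cap else word
        pieces.append(token + punct)
        # Assumption: a hyphen joins directly to the next word,
        # while every other mark (or no mark) is followed by a space.
        pieces.append("" if punct == "-" else " ")
    return "".join(pieces).rstrip()


words = ["bonjour", "comment", "allez", "vous"]
punct = [",", "", "-", "?"]
caps = [True, False, False, False]
print(apply_predictions(words, punct, caps))  # Bonjour, comment allez-vous?
```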
RIVA Punctuation and Capitalization for Hindi
Model
For each word in the input text, the model predicts a punctuation mark that should follow the word, if any. The model supports commas, poornvirams, exclamation marks, and question marks.
Russian Tagger-based Inverse Text Normalization
Model
Russian single-pass tagger-based model for inverse text normalization, based on a BERT encoder and trained on 2 million sentences from the Google Text Normalization Dataset. It achieves 3.55% WER on the Google default test set.
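For readers unfamiliar with the metric, word error rate (WER) is commonly computed as the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The sketch below shows that standard definition. The exact scoring script behind the 3.55% figure is not given in this listing, so this is an assumption about the general metric, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One word dropped from a six-word reference gives a WER of 1/6.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"{wer:.2%}")
```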
Riva ASR Hindi LM
Model
Base Hindi 3-gram LM
RIVA Citrinet ASR Hindi (hi-IN) - ASR set 1.0
Model
Hindi Citrinet ASR model trained on ASR set 1.0
RIVA Conformer ASR Hindi - ASR set 2.0
Model
Hindi Conformer ASR model trained on ASR set 2.0
Riva ASR French LM
Model
Base French 4-gram LM
Riva ASR English(en-GB) LM
Model
Base English 3-gram LM
Riva ASR Mandarin LM
Model
Base Mandarin 4-gram LM
Riva ASR English(en-US) LM
Model
Base English n-gram LM trained on LibriSpeech, Switchboard and Fisher
STT Hi Conformer-CTC Large
Model
Conformer-CTC-Large model for Hinglish Automatic Speech Recognition, trained on the ULCA and Europal datasets.