NGC | Catalog
Many AI applications have common needs: classification, object detection, language translation, text-to-speech, recommender engines, sentiment analysis, and more. When developing applications with these capabilities, it is much faster to start with a model that is pre-trained and then tune it for a specific use case. The NGC catalog offers pre-trained models for a variety of common AI tasks that are optimized for NVIDIA Tensor Core GPUs, and can be easily re-trained by updating just a few layers, saving valuable time.
TTS Vocoder Waveglow 88M
Model
WaveGlow Speech Synthesis model with 88M Parameters
TTS Vocoder Waveglow 268M
Model
WaveGlow Speech Synthesis model with 268M Parameters
TTS Vocoder Uniglow
Model
UniGlow Speech Synthesis model
TTS Vocoder Squeezewave
Model
SqueezeWave Speech Synthesis model
TTS Vocoder Melgan
Model
MelGAN Speech Synthesis model
TTS Vocoder Hifigan
Model
HiFiGAN Speech Synthesis model
TTS En TalkNet
Model
Speech Synthesis model trained on female English speech
TTS En Tacotron2
Model
Tacotron2 Speech Synthesis model trained on female English speech
TTS En LJ UnivNet
Model
UnivNet speech synthesis model trained on female English speech (LJSpeech dataset)
TTS En LJ Mixer-TTS-X
Model
Mixer-TTS Speech Synthesis model trained on female English speech (LJSpeech dataset)
TTS En LJ Mixer-TTS
Model
Mixer-TTS Speech Synthesis model trained on female English speech (LJSpeech dataset)
TTS En LJ HiFi-GAN
Model
HiFiGAN Speech Synthesis model trained on female English speech (LJSpeech dataset)
TTS En LibriTTS UnivNet
Model
UnivNet speech synthesis model trained on English speech (LibriTTS dataset)
TTS En Glowtts
Model
Glow-TTS Speech Synthesis model trained on female English speech
TTS En FastSpeech 2
Model
FastSpeech 2 speech synthesis model trained on female English speech
TTS En FastPitch
Model
FastPitch Speech Synthesis model trained on female English speech
TTS En E2E FastPitch Hifigan
Model
FastPitch+HiFiGAN End-to-End Speech Synthesis model trained on female English speech
TTS En E2E Fastspeech2 Hifigan
Model
FastSpeech2+HiFiGAN End-to-End Speech Synthesis model trained on female English speech
TTS De FastPitch HiFi-GAN
Model
This collection contains two models: 1) FastPitch (around 50M parameters), trained on the OpenSLR neutral German dataset, which contains over 23 hours of German speech from a single speaker; 2) HiFi-GAN, trained on mel spectrograms produced by the FastPitch model in (1).
TTS DE Multi-Speaker FastPitch HiFiGAN
Model
This collection includes two German models: FastPitch, trained on the HUI-Audio-Corpus-German clean dataset, from which the five speakers with the most data are selected and balanced; and HiFiGAN, trained on mel spectrograms predicted by the multi-speaker FastPitch model.
RIVA EnglishUS Hifigan
Model
RIVA EnglishUS Fastpitch
Model
Riva Marblenet Voice Activity Detection
Model
Riva Marblenet Voice Activity Detection
STT Kab Conformer-Transducer Large
Model
Conformer-Transducer-Large model for Kabyle Automatic Speech Recognition, trained on Mozilla Common Voice 10.0 Kabyle train set.
Riva ASR Spanish Inverse Normalization Grammar
Model
Base Spanish inverse normalization grammar for Riva ASR