NGC | Catalog
Models
The NGC catalog offers hundreds of pre-trained models for computer vision, speech, recommendation, and more. Bring AI to market faster by using these models as-is, or quickly build proprietary models using only a fraction of the custom data you would otherwise need.
RIVA EnglishUS Energy Hifigan
HiFi-GAN model fine-tuned for energy-conditioned multispeaker IPA FastPitch.

RIVA EnglishUS Fastpitch Energy Conditioned
Riva energy-conditioned multispeaker model with IPA.

RIVA Conformer ASR Spanish
Spanish Conformer ASR model trained on ASR set 2.0.

ResNet50 pretrained weights (PyTorch, AMP, ImageNet)
ResNet50 ImageNet pretrained weights.

PeopleNet
Three-class object detection network to detect people in an image.

TAO Pretrained Object Detection
Pretrained weights to facilitate transfer learning using TAO Toolkit.

RIVA Conformer ASR Mandarin
Mandarin (zh-CN) Conformer ASR model trained on ASR set 5.0.

RIVA Conformer ASR German
German Conformer ASR model trained on RIVA ASR set.

RIVA Conformer ASR French
French Conformer ASR model trained on RIVA ASR set.

RIVA Conformer ASR English
English (en-GB) Conformer ASR model trained on ASR set 1.0.

StyleGAN3 pretrained models
StyleGAN3 pretrained models for FFHQ, AFHQv2 and MetFaces datasets.

StyleGAN2 pretrained models
For use with the official StyleGAN3 implementation: https://github.com/NVlabs/stylegan3

TTS En HiFiTTS VITS
End-to-end parallel speech synthesis model.

Riva ASR German Inverse Normalization Grammar

Riva ASR Spanish Inverse Normalization Grammar

Riva ASR English Inverse Normalization Grammar

Riva ASR French Inverse Normalization Grammar

RIVA EnglishUS Fastpitch
FastPitch is a mel-spectrogram generator designed to be used as the first part of a neural text-to-speech system, in conjunction with a neural vocoder.

Speech Synthesis English FastPitch
Mel-spectrogram prediction conditioned on input text, with the LJSpeech voice.

Speech Synthesis HiFi-GAN
GAN-based waveform generator from mel-spectrograms.

RIVA EnglishUS Hifigan
HiFi-GAN is a neural vocoder model for text-to-speech applications. It is intended as the second part of a two-stage speech synthesis pipeline, with a mel-spectrogram generator such as FastPitch as the first stage.

RIVA ASR Russian LM
Base Russian n-gram language model.

RIVA EnglishUS RADTTS
Riva multispeaker model with IPA for G2P.

RIVA EnglishUS RADTTS Hifigan
Riva multispeaker model with IPA for G2P.

RIVA Punctuation