NGC | Catalog
Many AI applications have common needs: classification, object detection, language translation, text-to-speech, recommender engines, sentiment analysis, and more. When developing applications with these capabilities, it is much faster to start with a model that is pre-trained and then tune it for a specific use case. The NGC catalog offers pre-trained models for a variety of common AI tasks that are optimized for NVIDIA Tensor Core GPUs, and can be easily re-trained by updating just a few layers, saving valuable time.
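To make the idea of "updating just a few layers" concrete, here is a minimal, dependency-free sketch. It is not how any NGC model or NVIDIA toolkit works internally. The `backbone` function, the `frozen` weights, and the toy dataset are all hypothetical stand-ins, chosen only to show the pattern of keeping pre-trained layers fixed while training a small new head.

```python
import math

def backbone(x, frozen_weights):
    """Stand-in for pre-trained layers: a fixed linear map followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in frozen_weights]

def predict(features, head):
    """New task-specific head: logistic regression on backbone features."""
    z = sum(w * f for w, f in zip(head["w"], features)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune(data, frozen_weights, epochs=200, lr=0.5):
    """Train only the head; the backbone weights are read but never modified."""
    head = {"w": [0.0] * len(frozen_weights), "b": 0.0}
    for _ in range(epochs):
        for x, y in data:
            feats = backbone(x, frozen_weights)
            # Gradient of the log loss with respect to the logit.
            err = predict(feats, head) - y
            head["w"] = [w - lr * err * f for w, f in zip(head["w"], feats)]
            head["b"] -= lr * err
    return head

# Hypothetical "pre-trained" weights and a toy binary dataset (assumptions).
frozen = [[1.0, 0.0], [0.0, 1.0]]
toy = [([2.0, 0.1], 1), ([1.5, -0.2], 1), ([-2.0, 0.3], 0), ([-1.0, -0.1], 0)]

snapshot = [row[:] for row in frozen]  # copy to confirm the backbone stays fixed
head = fine_tune(toy, frozen)
```

Only `head` changes during training, which is why this style of tuning tends to need far less data and compute than training every layer from scratch. Real workflows would use a deep learning framework and an actual pre-trained checkpoint rather than hand-written weights.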
ESS DNN Stereo Disparity
Model
ESS is a DNN that estimates disparity for a stereo image pair and returns a continuous disparity map for the given left image.
Bi3D Proximity Segmentation
Model
Bi3D is a binary depth classification network that classifies whether objects are nearer or farther than a given distance.
TTS En Multispeaker FastPitch HiFiGAN
Model
This collection contains two models: 1) Multi-speaker FastPitch (around 50M parameters), trained on HiFiTTS with over 291.6 hours of English speech from 10 speakers. 2) HiFiGAN, trained on mel spectrograms produced by the Multi-speaker FastPitch model in (1).
STT En Conformer-CTC Large
Model
Conformer-CTC-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
STT En Conformer-Transducer Large
Model
Conformer-Transducer-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
NER En Bert
Model
Named Entity Recognition model with BERT
PeopleSemSegnet
Model
Semantic segmentation of persons in an image.
Riva ASR English Inverse Normalization Grammar
Model
Base English grammar
DashCamNet
Model
4-class object detection network to detect cars in an image.
RIVA Conformer ASR Hindi
Model
Hindi Conformer ASR model trained on ASR set 1.0
Riva ASR Russian LM
Model
Base Russian 4-gram LM
English Tagger-based Inverse Text Normalization
Model
English single-pass tagger-based model for inverse text normalization, based on bert-base-uncased and trained on 2 million sentences from the Google Text Normalization Dataset. It achieves 3.75% WER on the Google default test set.
Efficient Geometry-aware 3D Generative Adversarial Networks
Model
Pretrained EG3D models for FFHQ, AFHQ, and ShapeNet Cars
TAO Pretrained Object Detection
Model
Pretrained weights to facilitate transfer learning using TAO Toolkit.
SSL En Conformer Large
Model
Self-Supervised Learning (SSL) checkpoints for the Conformer Large model. These are similar to the w2v-Conformer model and can be fine-tuned for Automatic Speech Recognition (ASR).
STT En Conformer-CTC XLarge
Model
Conformer-CTC-XLarge model for English Automatic Speech Recognition, trained on NeMo ASRSET
STT En Conformer-Transducer XLarge
Model
Conformer-Transducer-XLarge model for English Automatic Speech Recognition, trained on NeMo ASRSET
SSL En Conformer XLarge
Model
Self-Supervised Learning (SSL) checkpoints for the Conformer XLarge model. These are similar to the w2v-Conformer model and can be fine-tuned for Automatic Speech Recognition (ASR).
PeopleNet
Model
3-class object detection network to detect people in an image.
Riva ASR Mandarin LM
Model
Base Mandarin 4-gram LM
TrafficCamNet
Model
4-class object detection network to detect cars in an image.
PeopleSegNet
Model
1-class instance segmentation network to detect and segment instances of people in an image.
LPDNet
Model
Object detection network to detect license plates in an image of a car.
Riva TTS English US Auxiliary Files
Model
Contains files used in RMIR creation
Riva TTS English Normalization Grammar
Model
Base English grammar