Search thousands of GPU-optimized Containers, pretrained Models, SDKs, and Helm charts, ready to accelerate AI, digital twins, and HPC from cloud to edge.
Displaying 18 results
The Domain Specific - NeMo Automatic Speech Recognition (ASR) Application facilitates training, evaluation and performance comparison of ASR models. This NeMo application enables you to train or fine-tune pre-trained ASR models with your own data.
Container
MeetKai Inc.
MK-SQuIT
SQuIT (Synthesizing Questions using Iterative Template-Filling) is a generated dataset produced with little human intervention. This container provides several tutorial applications: an interactive dataset explorer, a walkthrough of the generation pipeline, and a demonstration that uses NeMo to fine-tune and evaluate a model on the dataset.
Container
This model card contains a Small Audio Codec model trained on the Libri-Light audiobook recordings dataset, which comprises approximately 60,000 hours of English speech sampled at 16 kHz.
Model
NVIDIA
TitaNet-S
TitaNet Small model for Speaker Verification and Diarization tasks
Model
PearlNet Lang ID model for Spoken Language Identification
Model
Fast Conformer-CTC-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
Large version of the hybrid Fast Conformer TDT-CTC model (114M parameters), trained on a larger dataset of 36,000 hours with punctuation and capitalization. This model was jointly developed by the NVIDIA NeMo and Suno.ai teams.
Model
This collection contains the large version (114M) of the English speech recognition model with a FastConformer encoder and a Hybrid decoder (joint RNNT-CTC loss). The model has a vocab size of 1024 and emits text with punctuation and capitalization.
Model
English + Mandarin Multilingual and Code-Switched Speech Recognition FastConformer Transducer Large Model
Model
Fast Conformer-Transducer-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
This collection contains the large version (114M) of the Persian speech recognition model with a FastConformer encoder and a Hybrid decoder (joint RNNT-CTC loss). The model has a vocab size of 1024.
Model
This collection contains FastPitch and Spectrogram Enhancer models. The main use case is English ASR domain fine-tuning. Direct TTS use is not advised.
Model
Fast Conformer-Transducer-XXLarge model for English Automatic Speech Recognition, trained on NeMo ASRSET 3.0
Model
Fast Conformer-CTC-XLarge model for English Automatic Speech Recognition, trained on NeMo ASRSET
Model
Fast Conformer-Transducer-Large model for English Automatic Speech Recognition, trained on NeMo ASRSET 3.0
Model
Fast Conformer-Transducer-Large model for English Automatic Speech Recognition, trained with NeMo on the LibriSpeech dataset
Model
Fast Conformer-CTC-XXLarge model for English Automatic Speech Recognition, pre-trained on LibriLight and fine-tuned on NeMo ASRSET 3.0
Model
An 8 million parameter BERT model fully pre-trained with BioNeMo
Model