Search thousands of GPU-optimized Containers, pretrained Models, SDKs, and Helm charts, ready to accelerate AI, digital twins, and HPC from cloud to edge.
Displaying 36 results
Riva Speech Skills is a scalable Conversational AI service platform.
Container
NVIDIA AI Enterprise
RIVA TTS NIM provides easy access to state-of-the-art text-to-speech models capable of synthesizing English speech from text
Container
Scripts and utilities for getting started with Riva Speech Skills
Resource
NVIDIA Developer Program
RIVA TTS NIM provides easy access to state-of-the-art text-to-speech models capable of synthesizing English speech from text
Container
Contains files used in RMIR creation
Model
NVIDIA Developer Program
Chatterbox TTS Multilingual NIM Container
Container
WaveGlow model weights pre-trained on the LJ Speech dataset to be used with https://github.com/NVIDIA/waveglow.
Model
NVIDIA Deep Learning Examples
HiFi-GAN PyT checkpoint (22kHz, AMP)
HiFi-GAN v1 PyTorch checkpoint trained on 8 GPUs with AMP on LJSpeech-1.1 (22kHz).
Model
NVIDIA
WaveGlow
WaveGlow is a flow-based network capable of generating high-quality speech from mel-spectrograms.
Model
End-to-end workflow for text-to-speech training with TAO Toolkit and deployment using Riva.
Resource
HiFi-GAN v1 PyTorch checkpoint trained on 8 GPUs with AMP on LJSpeech-1.1 (22kHz), fine-tuned on FastPitch outputs.
Model
Model checkpoints for the Tacotron 2 model trained with NeMo.
Model
Mel-Spectrogram prediction conditioned on input text with LJSpeech voice.
Model
GAN-based waveform generator from mel-spectrograms.
Model
This model card contains a Small Audio Codec model trained on the Libri-Light audiobook recordings dataset, comprising approximately 60,000 hours of English language speech with a 16kHz sampling rate.
Model
NVIDIA Deep Learning Examples
HiFi-GAN for PyTorch
The HiFi-GAN model is a spectrogram inversion model that synthesizes speech waveforms from mel-spectrograms.
Resource
Model checkpoints for the WaveGlow model trained with NeMo.
Model
Riva NeMo-MagpieTTS Multilingual IPA multispeaker model with Emotions
Model
NVIDIA Deep Learning Examples
Tacotron2 PyTorch checkpoint (AMP)
Tacotron2 PyTorch checkpoint trained with AMP
Model
NVIDIA
Flowtron
Flowtron is an Autoregressive Flow-based Network for Text-to-Mel-spectrogram Synthesis.
Model
NVIDIA Deep Learning Examples
Tacotron2 and Waveglow 2.0 for PyTorch
The Tacotron 2 and WaveGlow models form a text-to-speech system that enables users to synthesize natural-sounding speech from raw transcripts.
Resource
FastPitch PyTorch checkpoint trained on LJSpeech-1.1
Model
Universal waveform generator from mel-spectrograms.
Model
22.05kHz full-band Mel Codec model trained on multilingual speech.
Model
