GPU-optimized AI, Machine Learning, & HPC Software

QuartzNet is a Jasper-like network that uses separable convolutions and larger filter sizes. It achieves accuracy comparable to Jasper with far fewer parameters. This particular model has 15 blocks, each repeated 5 times.

Model
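
The parameter saving that the QuartzNet description attributes to separable convolutions can be illustrated with a rough weight count. This is a minimal sketch, not the model's actual configuration: the channel and kernel sizes below are illustrative assumptions, the helper names are hypothetical, and biases and normalization layers are ignored.

```python
def conv1d_params(c_in, c_out, kernel):
    """Weight count of a standard 1D convolution (bias omitted)."""
    return c_in * c_out * kernel


def separable_conv1d_params(c_in, c_out, kernel):
    """Weight count of a depthwise-separable 1D convolution (bias omitted).

    Depthwise step: one kernel per input channel.
    Pointwise step: a 1x1 convolution that mixes channels.
    """
    depthwise = c_in * kernel
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative sizes only (assumed, not taken from the model card).
c_in, c_out, kernel = 256, 256, 33

standard = conv1d_params(c_in, c_out, kernel)
separable = separable_conv1d_params(c_in, c_out, kernel)

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {standard / separable:.1f}x")
```

With a large kernel, the pointwise term dominates the separable count, so widening the filter adds comparatively few weights. That is consistent with the description's pairing of separable convolutions with larger filter sizes.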

NVIDIA

STT En Conformer-CTC XLarge

Conformer-CTC-XLarge model for English Automatic Speech Recognition, trained on NeMo ASRSET

Model

NVIDIA

STT En Conformer-CTC Large LibriSpeech

Conformer-CTC-Large model for English Automatic Speech Recognition, trained with NeMo on the LibriSpeech dataset

Model

NVIDIA

ASR Language Modeling Transformer Large LibriSpeech

Transformer-Large language model for English ASR, trained with NeMo on the LibriSpeech text corpus

Model

NVIDIA

TTS En FastSpeech 2

FastSpeech 2 speech synthesis model trained on female English speech

Model

NVIDIA

STT En Conformer-Transducer Small

Conformer-Transducer-Small model for English Automatic Speech Recognition, trained on NeMo ASRSET

Model

NVIDIA

Megatron Multilingual En Any 500M

Megatron Multilingual Neural Machine Translation model that translates from English to Any* language. Supported languages: cs, da, de, el, es, fi, fr, hu, it, lt, lv, nl, no, pl, pt, ro, ru, sk, sv, zh, ja, hi, ko, et, sl, bg, uk, hr, ar, vi, tr, id

Model

NVIDIA

Punctuation En Bert

Punctuation and Capitalization model with BERT

Model

NVIDIA

STT En FastConformer Hybrid Transducer-CTC Large Streaming 80ms

This collection contains the large version (114M parameters) of the streaming speech recognition model trained on NeMo ASRSET for English, with a look-ahead of 80 ms. All models are cache-aware hybrid FastConformer models with both Transducer and CTC decoders.

Model

NVIDIA

NER En Bert

Named Entity Recognition model with BERT

Model