STT Multilingual FastConformer Hybrid Transducer-CTC Large P&C

NVIDIA

The large version (114M parameters) of the multilingual speech recognition model, with a FastConformer encoder and a hybrid decoder trained with a joint RNNT-CTC loss. The model has a vocabulary size of 2560 and emits text with punctuation and capitalization.

Name	Size	Updated
stt_multilingual_fastconformer_hybrid_large_pc.nemo	450.64 MB	March 1, 2025 UTC
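
As a rough illustration of working with the file listed above, the sketch below inspects and then restores the checkpoint. It assumes the file has already been downloaded locally and that a `.nemo` file is a tar archive, as is usual for NeMo checkpoints. The helper names `list_nemo_members` and `load_hybrid_model` are hypothetical. The loader further assumes that the NeMo toolkit with its ASR collection is installed and that `EncDecHybridRNNTCTCBPEModel` is the matching class for this hybrid model.

```python
import tarfile


def list_nemo_members(path):
    """Return the sorted member names stored inside a .nemo archive."""
    # "r:*" lets tarfile detect whether the archive is compressed.
    with tarfile.open(path, "r:*") as archive:
        return sorted(member.name for member in archive.getmembers())


def load_hybrid_model(path):
    """Restore the checkpoint with NeMo (assumption: NeMo ASR is installed)."""
    # Imported lazily so that listing the archive works without NeMo.
    from nemo.collections.asr.models import EncDecHybridRNNTCTCBPEModel

    return EncDecHybridRNNTCTCBPEModel.restore_from(restore_path=path)
```

Calling `list_nemo_members("stt_multilingual_fastconformer_hybrid_large_pc.nemo")` is a cheap way to check that the download is intact before loading the full model with `load_hybrid_model`.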