Model
This is the large version (114M parameters) of the multilingual speech recognition model, with a FastConformer encoder and a hybrid decoder trained with a joint RNNT-CTC loss. The model has a vocabulary size of 2560 and emits text with punctuation and capitalization.
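For orientation, a joint RNNT-CTC objective is usually written as a weighted sum of the two losses. The form below is a sketch of that general idea; the weight $\lambda$ is a placeholder, and the value used to train this checkpoint is not stated here.

```latex
\mathcal{L}_{\text{hybrid}} = (1 - \lambda)\,\mathcal{L}_{\text{RNNT}} + \lambda\,\mathcal{L}_{\text{CTC}},
\qquad 0 \le \lambda \le 1
```

With $\lambda = 0$ the objective reduces to pure RNNT training, and with $\lambda = 1$ to pure CTC training.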
Use the NGC CLI to download:
| Name | Size | Updated | Actions |
|---|---|---|---|
| stt_multilingual_fastconformer_hybrid_large_pc.nemo | 450.64 MB | March 1, 2025 UTC | |
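Once the `.nemo` file is on disk, it can be restored with the NeMo toolkit. The following is a minimal sketch, assuming NeMo is installed (`pip install "nemo_toolkit[asr]"`) and the checkpoint sits in the current directory. The helper names `checkpoint_path` and `load_model` and the audio file `sample.wav` are hypothetical and used only for illustration.

```python
from pathlib import Path

# File name as listed in the table above; the download directory is an assumption.
CHECKPOINT_NAME = "stt_multilingual_fastconformer_hybrid_large_pc.nemo"


def checkpoint_path(download_dir="."):
    """Return the expected local path of the downloaded checkpoint."""
    return Path(download_dir) / CHECKPOINT_NAME


def load_model(download_dir="."):
    """Restore the checkpoint with NeMo and return the model object."""
    path = checkpoint_path(download_dir)
    if not path.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {path}")

    # Imported lazily so the path helpers work even without NeMo installed.
    import nemo.collections.asr as nemo_asr

    # ASRModel.restore_from resolves the concrete model class from the
    # configuration stored inside the .nemo archive.
    return nemo_asr.models.ASRModel.restore_from(restore_path=str(path))


if __name__ == "__main__":
    model = load_model()
    # "sample.wav" is a hypothetical recording (assumed 16 kHz, mono).
    print(model.transcribe(["sample.wav"]))
```

Checking for the file before importing NeMo keeps the failure mode clear: a missing download raises `FileNotFoundError` immediately instead of surfacing later as a less specific restore error.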