Model
Mandarin (zh-CN) Parakeet-CTC-XL-0.6B ASR model trained on ASR set 3.0
| Field | Response |
|---|---|
| Intended Applications & Domains: | Speech Transcription |
| Types: | Speech Transcription |
| Intended Users: | This model is intended for developers and data scientists building interactive call centers, virtual assistants, and language learning assistants |
| Output: | Transcribed text with timestamps and confidence scores |
| Describe how the model works: | The model transcribes audio input into text in the language of the input audio |
| Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Age, Gender, National Origin |
| Technical Limitations: | Transcripts may not be 100% accurate. Accuracy varies based on the characteristics of the input audio (domain, use case, accent, noise, speech type, context of speech, etc.) |
| Verified to have met prescribed NVIDIA quality standards: | Yes |
| Performance Metrics: | Word Error Rate (WER), Silence Robustness (characters per minute of silent audio), Latency (in milliseconds), Throughput (total audio processed per unit of time) |
| Potential Known Risks: | Not recommended for word-for-word transcription as accuracy varies based on the characteristics of input audio (domain, use case, accent, noise, speech type, and context of speech) |
| Licensing: | https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-community-models-license/ |
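
The Word Error Rate listed under Performance Metrics is conventionally the token-level edit distance between a reference transcript and the model output, divided by the number of reference tokens. The sketch below is a minimal illustration of that definition, not the evaluation code used for this model. The function names `edit_distance` and `error_rate` are hypothetical, as is the `unit="char"` option, which assumes character-level scoring (a common choice for unsegmented Mandarin text).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Edit distance normalized by reference length.

    unit="word" splits on whitespace (WER).
    unit="char" scores per non-space character (an assumed option for Mandarin).
    """
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    elif unit == "char":
        ref = [c for c in reference if not c.isspace()]
        hyp = [c for c in hypothesis if not c.isspace()]
    else:
        raise ValueError("unit must be 'word' or 'char'")
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `error_rate("今天天气很好", "今天天气真好", unit="char")` scores one substitution against six reference characters, giving 1/6.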