Model
Japanese Conformer ASR model trained on RIVA ASR set
| Field | Response |
|---|---|
| Intended Applications & Domains: | Speech Transcription |
| Types: | Speech Transcription |
| Intended Users: | Data scientists working on contact center transcription, video conferencing transcription, virtual assistants, etc. |
| Output: | Transcribed text with timestamps and confidence scores |
| Describe how the model works: | Model transcribes audio input into text for the input language |
| Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of: | Age, Gender, National Origin |
| Technical Limitations: | Transcripts may not be 100% accurate. Accuracy varies based on the characteristics of input audio (Domain, Use Case, Accent, Noise, Speech Type, Context of speech, etc.) |
| Verified to have met prescribed NVIDIA quality standards: | Yes |
| Performance Metrics: | Character Error Rate (CER), Silence Robustness (characters emitted per minute of silent audio), Latency (in milliseconds), Throughput (total audio processed per unit of time) |
| Potential Known Risks: | Not recommended for word-for-word transcription as accuracy varies based on the characteristics of input audio (domain, use case, accent, noise, speech type, and context of speech) |
| Licensing: | https://developer.nvidia.com/riva/ga/license |
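
The Character Error Rate (CER) listed under Performance Metrics is conventionally defined as the character-level edit distance between the reference transcript and the model output, divided by the number of characters in the reference. This card does not say how the metric was computed for this model, so the following is only a minimal sketch under that standard definition. The function names `edit_distance` and `character_error_rate` are hypothetical and are not part of any Riva API, and no text normalization (punctuation, whitespace, script variants) is applied.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance counted in characters (insert, delete, substitute)."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion of ref_char
                curr[j - 1] + 1,     # insertion of hyp_char
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters (assumed definition)."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Two of the five reference characters are substituted.
    print(character_error_rate("こんにちは", "こんばんは"))  # 0.4
```

Because Japanese text is not whitespace-delimited, a character-level rate avoids depending on a particular word segmenter, which is one common reason to report CER rather than a word-level error rate for this kind of model.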