Riva TTS Spanish US FastPitch | NVIDIA NGC

Spanish US FastPitch model

Field	Response
Intended Application & Domain	Speech Synthesis
Model Task	Speech Synthesis and Generative Adversarial Network
Intended Users	This model is intended for developers building interactive call centers, virtual assistants, and language learning assistants that improve pronunciation, as well as applications that automatically generate voice-overs, narrate or comment on videos, or provide audio alternatives for visually impaired users or people with light sensitivity.
Model Output	Audio files (.wav)
How the model works	The model converts input text characters into an audio representation.
Technical Limitations	The model can only produce a voice in the language, dialect, and gender(s) on which it was trained. It makes no effort to moderate or modify input text.
Performance Metrics	% preference when compared with available alternatives; mean and median of each of the following: Pitch, Pitch standard deviation (Pitch_std), Pitch_kurtosis, Pitch_skew, Fundamental Frequency Ratio (f0_ratio), f0_ratio_std, f0_ratio_kurtosis, f0_ratio_skew
Potential Known Risks	May produce unnatural-sounding speech for vocabulary not included in the pronunciation dictionary, or omit phonetic symbols not used in training.
Licensing	https://docs.nvidia.com/ai-foundation-models-community-license.pdf
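
The card lists pitch statistics (mean, standard deviation, kurtosis, skew), each reported as a mean and a median, but does not say how they are computed. The following is a minimal sketch of one plausible reading, not NVIDIA's evaluation code. It assumes each utterance is given as a list of per-frame F0 values in Hz, that frames with F0 <= 0 are unvoiced and skipped, that population (biased) moments are used, and that kurtosis is reported as excess kurtosis. The function names `pitch_stats` and `summarize` are hypothetical. The f0_ratio metrics are left out because the card does not define them.

```python
import math
import statistics


def pitch_stats(f0_hz):
    """Per-utterance pitch statistics from per-frame F0 values in Hz.

    Assumption: frames with F0 <= 0 are unvoiced and are ignored.
    """
    voiced = [f for f in f0_hz if f > 0]
    n = len(voiced)
    if n < 2:
        raise ValueError("need at least two voiced frames")

    mean = sum(voiced) / n
    # Population (biased) central moments of order 2, 3 and 4.
    m2 = sum((f - mean) ** 2 for f in voiced) / n
    m3 = sum((f - mean) ** 3 for f in voiced) / n
    m4 = sum((f - mean) ** 4 for f in voiced) / n
    std = math.sqrt(m2)

    if std == 0:
        # Skew and kurtosis are undefined for a constant contour.
        skew = kurtosis = math.nan
    else:
        skew = m3 / std ** 3
        kurtosis = m4 / m2 ** 2 - 3.0  # excess kurtosis (assumed convention)

    return {
        "pitch": mean,
        "pitch_std": std,
        "pitch_kurtosis": kurtosis,
        "pitch_skew": skew,
    }


def summarize(contours):
    """Mean and median of each per-utterance statistic across utterances."""
    per_utt = [pitch_stats(c) for c in contours]
    return {
        key: {
            "mean": statistics.mean([s[key] for s in per_utt]),
            "median": statistics.median([s[key] for s in per_utt]),
        }
        for key in per_utt[0]
    }


if __name__ == "__main__":
    # Two toy contours; zeros mark unvoiced frames.
    report = summarize([[100, 0, 200, 300, 0], [200, 200, 400, 0]])
    for name, agg in report.items():
        print(name, agg)
```

The F0 contours themselves would come from a separate pitch tracker run on the synthesized .wav files; that step is outside this sketch.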