NGC | Catalog

SF Bilingual Speech in Chinese and English



A bilingual (Mandarin-English) speech dataset.


Chunghwa Telecom Laboratories

Latest Version



April 4, 2023

Compressed Size

477.68 MB

Get a ready-to-use bilingual Chinese and English speech dataset.

This dataset is intended for training bilingual (Chinese and English) text-to-speech models, including training the FastPitch acoustic model with the FastPitch training recipe from NVIDIA Deep Learning Examples. It contains about 2,740 bilingual audio samples from a single female speaker, along with their corresponding text transcripts. Each sample is around 5-6 seconds long, and the total length is approximately 4.5 hours.
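After extracting the archive, it can be useful to check that the clip count and total duration roughly match the figures quoted above. The following is a minimal sketch, not part of the dataset or of any NVIDIA tooling. It assumes the audio is stored as PCM WAV files somewhere under the extraction directory, which this page does not confirm. The function name `summarize_wavs` and the example path are hypothetical.

```python
import wave
from pathlib import Path


def summarize_wavs(root):
    """Return (clip_count, total_hours) for all .wav files under root.

    Assumption: the clips are PCM WAV files, which is the only format
    the standard-library wave module can read.
    """
    count = 0
    total_seconds = 0.0
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration of one clip = number of frames / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
        count += 1
    return count, total_seconds / 3600.0


# Example call (hypothetical path, adjust to the actual extraction folder):
# count, hours = summarize_wavs("path/to/extracted/dataset")
# print(f"{count} clips, {hours:.2f} hours")
```

If the layout matches these assumptions, the result should be close to 2,740 clips and 4.5 hours. A large mismatch would suggest an incomplete download or a different file format.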

The dataset is provided and shared by Chunghwa Telecom Laboratories. By downloading and using this dataset, you accept the terms and conditions of the license, CC BY-NC 4.0.