SF Bilingual Speech in Chinese and English

Chunghwa Telecom Laboratories

A bilingual (Mandarin-English) speech dataset.

Get a ready-to-use bilingual Chinese and English speech dataset.

This dataset is intended for training bilingual (Chinese and English) text-to-speech models, including training the FastPitch acoustic model with the NVIDIA Deep Learning Examples FastPitch training recipe. It contains about 2,740 bilingual audio samples from a single female speaker, together with their corresponding text transcripts. Each sample is around 5-6 seconds long, for a total length of approximately 4.5 hours.
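After downloading, the stated clip count and total length can be sanity-checked with a few lines of Python. The following is a minimal sketch, not part of the dataset itself. It assumes the audio is stored as uncompressed PCM WAV files (the format the standard-library `wave` module reads) somewhere under a single directory. The function name `summarize_wavs` is hypothetical, and the archive's actual layout and audio format are not described here.

```python
import wave
from pathlib import Path


def summarize_wavs(wav_dir):
    """Return (clip count, total hours, mean seconds) for WAV files under wav_dir.

    Assumption: the clips are PCM WAV files; other formats need another reader.
    """
    durations = []
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration in seconds = number of frames / sample rate.
            durations.append(wav.getnframes() / wav.getframerate())
    total_seconds = sum(durations)
    mean_seconds = total_seconds / len(durations) if durations else 0.0
    return len(durations), total_seconds / 3600.0, mean_seconds
```

Calling `summarize_wavs` on the directory where the archive was unpacked should, if the assumptions hold, report roughly 2,740 clips, a mean of 5-6 seconds, and a total in the neighborhood of 4.5 hours.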

The dataset is provided and shared by Chunghwa Telecom Laboratories. By downloading and using this dataset, you accept the terms and conditions of the license, CC BY-NC 4.0.

Publisher: Chunghwa Telecom Laboratories

Latest Version: v1

Updated: April 4, 2023 UTC

Compressed Size: 477.68 MB

Labels: AI, Audio Synthesis, bilingual, Conversational AI, dataset, DL, en, Natural Language Processing