Text to Speech Notebook

End-to-end workflow for text-to-speech training with TAO Toolkit and deployment using Riva.

Text to Speech

Text-to-speech (TTS), also called speech synthesis, is the problem of getting a program to generate human voice output from text. TAO Toolkit supports a two-stage pipeline for TTS:

  1. A spectrogram model to generate a Mel spectrogram from text (FastPitch)
  2. A vocoder model to generate audio from a Mel spectrogram (HiFiGAN)

Our goal here is to produce a FastPitch model and a HiFiGAN model that, when cascaded, generate good-quality human voice from text.

The best place to get started with TAO Toolkit TTS is the set of sample Jupyter notebooks enclosed in this resource. Three notebooks are included:

  1. Training: Sample workflow for training a FastPitch spectrogram generator model and a HiFiGAN vocoder model and exporting them to .riva files
  2. Finetuning: Sample workflow for finetuning a FastPitch spectrogram generator model and a HiFiGAN vocoder model from pretrained FastPitch and HiFiGAN models
  3. Deployment: Sample workflow for consuming the .riva files and deploying them to Riva

If you are a seasoned Conversational AI developer, we recommend installing TAO and referring to the TAO documentation for detailed information.

Pre-Requisites

Please make sure to install the following before proceeding further:

  • python 3.6.9
  • python-dev
  • docker-ce > 19.03.5
  • docker-API 1.40
  • nvidia-container-toolkit > 1.3.0-1
  • nvidia-container-runtime > 3.4.0-1
  • nvidia-docker2 > 2.5.0-1
  • nvidia-driver >= 455.23

Note: A compatible NVIDIA GPU is required.
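
The version requirements above can be checked before installing anything. The following is a minimal sketch, not an official NVIDIA check: it assumes GNU sort (for the -V option) is available, and the helper names version_ge, version_gt, and report are hypothetical. It only covers docker-ce and the NVIDIA driver; the remaining packages would need their own version queries.

```shell
# Sketch: compare installed versions with the minimums listed above.
# Assumes GNU sort, whose -V option orders version strings.

# version_ge A B: succeeds when version A >= version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# version_gt A B: succeeds when version A > version B.
version_gt() {
    [ "$1" != "$2" ] && version_ge "$1" "$2"
}

# report NAME FOUND OP REQUIRED: print one SKIP/PASS/FAIL line.
# OP is "gt" or "ge" and selects the comparison function above.
report() {
    if [ -z "$2" ]; then
        echo "SKIP  $1: not found on PATH"
    elif "version_$3" "$2" "$4"; then
        echo "PASS  $1 $2"
    else
        echo "FAIL  $1 $2 (need $3 $4)"
    fi
}

# Query the installed versions; leave the value empty if the tool is missing.
docker_ver=""
if command -v docker >/dev/null 2>&1; then
    docker_ver="$(docker version --format '{{.Client.Version}}' 2>/dev/null || true)"
fi

driver_ver=""
if command -v nvidia-smi >/dev/null 2>&1; then
    driver_ver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1 || true)"
fi

report docker-ce "$docker_ver" gt 19.03.5
report nvidia-driver "$driver_ver" ge 455.23
```

The strict comparison for docker-ce and the inclusive one for the driver mirror the > and >= signs in the list above.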

Installation

We recommend that you install TAO Toolkit inside a virtual environment. The steps are as follows:

virtualenv -p python3 <name of venv>
source <name of venv>/bin/activate
pip install jupyter notebook # If you need to run the notebooks

TAO Toolkit is a Python package hosted on the NVIDIA Python package index. You can install it with Python’s package manager, pip:

pip install nvidia-pyindex
pip install nvidia-tao
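
After the two pip commands, a quick sanity check can confirm that the package landed in the intended environment. This is a sketch under assumptions: in_venv is a hypothetical helper that relies on the VIRTUAL_ENV variable set by the virtualenv activate script, and the launcher is assumed to be exposed on PATH as tao.

```shell
# in_venv: succeeds when a virtualenv is active.
# Hypothetical helper; relies on VIRTUAL_ENV being set by bin/activate.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "Active virtual environment: $VIRTUAL_ENV"
else
    echo "No virtualenv is active (VIRTUAL_ENV is unset)"
fi

# pip show prints the package metadata, or fails if the package is absent.
python3 -m pip show nvidia-tao 2>/dev/null || echo "nvidia-tao is not installed here"

# Assumption: the package exposes its launcher on PATH as `tao`.
if command -v tao >/dev/null 2>&1; then
    echo "tao launcher found at: $(command -v tao)"
else
    echo "tao launcher not found on PATH"
fi
```

If the launcher is reported as missing even though pip show succeeds, the usual cause is that the virtual environment was not activated in the current shell.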

To download and run the Jupyter notebooks:

  1. Download the samples using the NGC CLI with the following command

    ngc registry resource download-version "nvidia/tao/texttospeech_notebook:v1.1"
    
  2. Start the Jupyter notebook server

    jupyter notebook --ip 0.0.0.0 --allow-root --port 8888
    

License

By downloading and using the models and resources packaged with TAO Toolkit Conversational AI, you accept the terms of the Riva license.

Publisher: NVIDIA
Latest Version: v1.3.0
Updated: April 4, 2023 UTC
Compressed Size: 168.4 KB