Automatic Speech Recognition (ASR) systems typically generate text with no punctuation and no capitalization. This tutorial explains how to train a model that predicts punctuation and capitalization for each word in a sentence, making the ASR output more readable and boosting the performance of downstream named entity recognition, machine translation, or text-to-speech models. In this sample, we start from a pre-trained BERT model. For every word in our training dataset we predict: the punctuation mark that should follow the word (if any), and whether the word should be capitalized. For example, the ASR output "do you know where mary lives" should be restored as "Do you know where Mary lives?"
The best place to get started with TLT - Punctuation and Capitalization is the TLT - Punctuation and Capitalization Jupyter notebook. This resource includes two notebooks: the first shows how to train a model with TLT and export it to a .ejrvs file, and the second shows how to take the .ejrvs file and deploy it to Jarvis. If you are a seasoned Conversational AI developer, we recommend installing TLT and referring to the TLT documentation for usage information.
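For reference, the commands below are only a rough sketch of the TLT launcher workflow that the first notebook walks through; the task name, spec file paths, results directories, and the $KEY variable are assumptions based on the general TLT Conversational AI command pattern, and the notebook contains the authoritative commands.

# Rough sketch only -- see the notebook for the exact commands and spec files.
# Train the punctuation and capitalization model from an experiment spec file.
tlt punctuation_and_capitalization train -e specs/train.yaml -r results/train -k $KEY
# Export the trained model to a .ejrvs file that the second notebook deploys to Jarvis.
tlt punctuation_and_capitalization export -e specs/export.yaml -r results/export -k $KEY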
Please make sure to install the following before proceeding further:
Note: A compatible NVIDIA GPU is required.
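If the NVIDIA driver is already installed, a quick way to confirm that a GPU is visible is:

nvidia-smi  # lists the available GPU(s) and the installed driver version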
We recommend that you install TLT inside a virtual environment. The steps to do so are as follows:
virtualenv -p python3 <name of venv>  # create a new virtual environment
source <name of venv>/bin/activate    # activate the virtual environment
pip install jupyter notebook # If you need to run the notebooks
TLT is a Python package hosted on the NVIDIA Python Package Index. You can install it using Python's package manager, pip.
pip install nvidia-pyindex
pip install nvidia-tlt
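After installation, you can optionally verify that the package and the TLT launcher are available (standard checks, assuming the install completed cleanly):

pip show nvidia-tlt  # confirm the package is installed in the virtual environment
tlt --help           # show the launcher usage and the supported tasks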
To download the Jupyter notebook, run the following command (this requires the NGC CLI to be installed and configured):
ngc registry resource download-version "nvidia/tlt-jarvis/punctuationcapitalization_notebook:v1.0"
Then start the Jupyter notebook server:
jupyter notebook --ip 0.0.0.0 --allow-root --port 8888
By downloading and using the models and resources packaged with TLT Conversational AI, you accept the terms of the Jarvis license.