Automatic Speech Recognition (ASR) systems typically generate text with no punctuation and no capitalization. This tutorial explains how to train a model that predicts punctuation and capitalization for each word in a sentence, making the ASR output more readable and boosting the performance of downstream named entity recognition, machine translation, or text-to-speech models. In this sample, we start from a pre-trained BERT model. For every word in our training dataset we predict: the punctuation mark that should follow the word (if any), and whether the word should be capitalized. For example, the ASR output "do you know where mary lives" should be restored as "Do you know where Mary lives?"
The best place to get started with TLT - Punctuation and Capitalization is the TLT - Punctuation and Capitalization Jupyter notebook. This resource includes two notebooks: the first shows how to train a model with TLT and export it to a .ejrvs file, and the second shows how to take the .ejrvs file and deploy it to Jarvis. If you are a seasoned Conversational AI developer, we recommend installing TLT and referring to the TLT documentation for usage information.
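For reference, the commands below are only a rough sketch of the TLT launcher workflow that the first notebook walks through; the task name, spec file paths, results directories, and the $KEY variable are assumptions based on the general TLT Conversational AI command pattern, and the notebook contains the authoritative commands.

# Rough sketch only -- see the notebook for the exact commands and spec files.
# Train the punctuation and capitalization model from an experiment spec file.
tlt punctuation_and_capitalization train -e specs/train.yaml -r results/train -k $KEY
# Export the trained model to a .ejrvs file that the second notebook deploys to Jarvis.
tlt punctuation_and_capitalization export -e specs/export.yaml -r results/export -k $KEY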
Please make sure to install the following before proceeding further:
Note: A compatible NVIDIA GPU is required.
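If the NVIDIA driver is already installed, a quick way to confirm that a GPU is visible is:

nvidia-smi  # lists the available GPU(s) and the installed driver version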
We recommend that you install TLT inside a virtual environment. The steps to do so are as follows:
virtualenv -p python3 <name of venv>  # create a new virtual environment
source <name of venv>/bin/activate    # activate the virtual environment
pip install jupyter notebook # If you need to run the notebooks
TLT is a Python package hosted on the NVIDIA Python Package Index. You can install it using Python's package manager, pip.
pip install nvidia-pyindex
pip install nvidia-tlt
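After installation, you can optionally verify that the package and the TLT launcher are available (standard checks, assuming the install completed cleanly):

pip show nvidia-tlt  # confirm the package is installed in the virtual environment
tlt --help           # show the launcher usage and the supported tasks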
To download the Jupyter notebook, run the following command (this requires the NGC CLI to be installed and configured):
ngc registry resource download-version "nvidia/tlt-jarvis/punctuationcapitalization_notebook:v1.0"
Then start the Jupyter notebook server:
jupyter notebook --ip 0.0.0.0 --allow-root --port 8888
By downloading and using the models and resources packaged with TLT Conversational AI, you accept the terms of the Jarvis license.