NVIDIA

TAO Toolkit - Conversational AI

TAO Toolkit is a Python-based AI toolkit for customizing purpose-built pre-trained AI models with your own data.

What is Train Adapt Optimize (TAO) Toolkit?

Train Adapt Optimize (TAO) Toolkit is a Python-based AI toolkit for taking purpose-built pre-trained AI models and customizing them with your own data. TAO adapts popular network architectures and backbones to your data, allowing you to train, fine-tune, and export highly optimized and accurate AI models for deployment.

The pre-trained models accelerate the AI training process and reduce the costs of large-scale data collection, labeling, and training models from scratch.

Build end-to-end services and solutions for conversational AI using TAO and Riva. TAO can train models for common conversational AI tasks such as text classification, question answering, and speech recognition.

Purpose-built Pre-Trained Models

Purpose-built pre-trained models offer highly accurate AI for a variety of conversational AI tasks. Developers, system builders, and software partners building conversational AI applications can fine-tune these models on their own custom data instead of collecting a large dataset and training from scratch.

The purpose-built models are available on NGC. Each model card offers a version that can be deployed as is and a version that can be fine-tuned with TAO on your own dataset.

Models here can be used for Automatic Speech Recognition (ASR), Question Answering (QA), Domain Classification, Named Entity Recognition (NER), Punctuation and Capitalization, and Joint Intent and Slot Classification. See each model card for information about how to fine-tune and evaluate on your data.

Model	Accuracy Metric(s)	Use Case
ASR: Jasper (English)	3.74%/10.21% WER (LibriSpeech dev-clean/dev-other)	English speech recognition
ASR: QuartzNet (English)	4.38%/11.30% WER (LibriSpeech dev-clean/dev-other)	English speech recognition (smaller model)
ASR: Citrinet (English)		English speech recognition
QA: BERT Base (SQuAD2.0)	73.35% EM score, 76.44% F1 score	Question answering
QA: BERT Large (SQuAD2.0)	77.16% EM score, 80.22% F1 score	Question answering
QA: BERT Megatron (SQuAD2.0)	78.0% EM score, 81.35% F1 score	Question answering
Domain Classification: BERT	90% accuracy for 4 domains of the weather chatbot	Text classification problems (e.g. sentiment analysis, domain detection)
Punctuation and Capitalization: BERT	77% F1 score	Punctuation and capitalization of ASR output
NER: BERT	74.21% F1 score	Named entity recognition and other token-level classification tasks
Joint Intent and Slot Classification: BERT	95% intent accuracy, 93% slot accuracy	Classifying intent and detecting relevant slots in a query

Running TAO Toolkit

Set up your Python environment using virtualenv and virtualenvwrapper.
TAO Toolkit provides a launcher as an abstraction above the container, and you launch all your training jobs from it. There is no need to pull the appropriate container manually; tao-launcher handles that. Install the launcher using pip with the following commands.
```
pip3 install nvidia-pyindex
pip3 install nvidia-tao
```
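Before running the install commands above, it can help to confirm that a virtualenv is active and that python3 and pip3 are on PATH. The following is a minimal sketch and not part of TAO Toolkit: the function names `in_virtualenv` and `missing_tools` are hypothetical helpers, and the check assumes the `VIRTUAL_ENV` variable that virtualenv's activate script exports.

```shell
# Hypothetical helper: succeeds when a virtualenv appears to be active.
# Assumption: virtualenv's activate script exports VIRTUAL_ENV.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Hypothetical helper: prints each named command that is not found on PATH,
# one per line. Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

if ! in_virtualenv; then
    echo "No active virtualenv detected; activate one before installing."
fi

missing="$(missing_tools python3 pip3)"
if [ -n "$missing" ]; then
    echo "Not found on PATH: $missing"
fi
```

If both checks pass silently, the environment is likely ready for the pip3 commands shown above.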
Download the Jupyter notebook for the task you are interested in from NGC resources. Each task has a training notebook and a deployment notebook. After you install the prerequisites, all training and deployment steps run from inside the Jupyter notebook.

Conversational AI Task	Jupyter Notebooks
Automatic Speech Recognition	Resources
Automatic Speech Recognition - Citrinet	Resources
Question Answering	Resources
Text Classification	Resources
Named Entity Recognition	Resources
Punctuation and Capitalization	Resources
Intent and Slot Classification	Resources
N-Gram Language Model	Resources
Speech Synthesis or Text to Speech	Resources

Using TAO Pre-trained Models

Pre-trained models for each Conversational AI task can be found under their respective collections on NGC.

License

By pulling and using the Transfer Learning Toolkit for Conversational AI container, you accept the terms and conditions of this license. By downloading and using the models and resources packaged with TAO Conversational AI, you accept the terms of the Riva license.

Technical blogs

Learn how to build and deploy Conversational AI models using TAO

Ethical AI

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.

Publisher

NVIDIA

Updated March 14, 2025 UTC

Labels

Finetuning, Inference, Transfer Learning
