NVIDIA NeMo is an open-source toolkit for conversational AI. It is designed for data scientists and researchers who want to build new state-of-the-art ASR (automatic speech recognition), NLP (natural language processing), and TTS (text-to-speech synthesis) networks from API-compatible building blocks that can be connected together.
"Neural Modules" are conceptual blocks that take typed inputs and produce typed outputs. NeMo makes it easy to combine and reuse these building blocks while providing a level of semantic correctness checking through its neural type system. Conversational AI architectures are typically very large and require a lot of data and compute for training. Built for speed, NeMo can use NVIDIA's Tensor Cores and scale training out to multiple GPUs and multiple nodes. NeMo uses PyTorch Lightning for easy and performant multi-GPU, multi-node, mixed-precision training. Every NeMo model is a LightningModule that comes equipped with all supporting infrastructure for training and reproducibility.
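The idea of typed inputs and outputs with a correctness check at connection time can be illustrated with a small, library-free sketch. This is not NeMo's neural type API: `PortType`, `Block`, and `connect` are hypothetical names invented for the illustration, and the real type system is richer than an equality comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortType:
    """Hypothetical stand-in for a neural type: what a tensor means plus its axis layout."""
    kind: str      # e.g. "audio_signal" or "log_probs"
    axes: tuple    # e.g. ("batch", "time")


class Block:
    """Hypothetical building block with one typed input and one typed output."""

    def __init__(self, name, input_type, output_type):
        self.name = name
        self.input_type = input_type
        self.output_type = output_type


def connect(upstream, downstream):
    """Return the pair if the types match, otherwise raise TypeError."""
    if upstream.output_type != downstream.input_type:
        raise TypeError(
            f"{upstream.name} produces {upstream.output_type}, "
            f"but {downstream.name} expects {downstream.input_type}"
        )
    return (upstream, downstream)


audio = PortType("audio_signal", ("batch", "time"))
features = PortType("spectrogram", ("batch", "feature", "time"))
log_probs = PortType("log_probs", ("batch", "time", "vocab"))

preprocessor = Block("preprocessor", audio, features)
encoder = Block("encoder", features, log_probs)

pipeline = connect(preprocessor, encoder)   # types line up, so this succeeds

try:
    connect(encoder, preprocessor)          # log_probs fed where audio is expected
    mismatch_caught = False
except TypeError:
    mismatch_caught = True
```

The point of the sketch is that a wiring mistake surfaces when blocks are connected, before any data flows through them.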
This NGC collection includes ready-to-use models for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Speech Synthesis (TTS). Any of these pretrained models can be used with the NeMo toolkit to build applications that work with domain-specific speech and NLP data. NeMo itself is organized into collections of modules, with ASR, NLP, and TTS modules available separately inside the toolkit.
Several pretrained models are provided as PyTorch checkpoints packaged in .nemo files. Models trained with NeMo are trained on multiple datasets to reach high accuracy. The overview page of each collection lists the datasets used and the accuracy achieved.
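As a rough illustration of what "packaged" means here, the sketch below lists the contents of a .nemo file using only the Python standard library. It assumes the file is a tar archive (possibly gzipped) holding a configuration file and a checkpoint, which may not hold for every release. The archive it inspects is a dummy built on the spot; the member names and the file name `demo_model.nemo` are placeholders for the demo, not files from this collection.

```python
import io
import os
import tarfile
import tempfile


def list_nemo_contents(path):
    """Return the sorted member names of a .nemo archive (assumed to be a tar file)."""
    # "r:*" lets tarfile detect compression, so plain and gzipped archives both open.
    with tarfile.open(path, "r:*") as archive:
        return sorted(member.name for member in archive.getmembers() if member.isfile())


def make_dummy_nemo(path):
    """Write a tiny stand-in archive so the sketch runs without downloading a model."""
    placeholder_files = {
        "model_config.yaml": b"target: hypothetical.Model\n",
        "model_weights.ckpt": b"not real weights",
    }
    with tarfile.open(path, "w:gz") as archive:
        for name, payload in placeholder_files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))


with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "demo_model.nemo")
    make_dummy_nemo(demo_path)
    contents = list_nemo_contents(demo_path)

print(contents)
```

Pointing `list_nemo_contents` at a downloaded model file is a quick way to check what a checkpoint bundles before loading it with the toolkit.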
The following Pretrained Models are provided in this collection:
To help you get started quickly with building and training conversational AI models, NeMo provides several Jupyter Notebook examples:
Domain | Title | GitHub URL |
---|---|---|
NeMo | Simple Application with NeMo | Voice swap app |
NeMo | Exploring NeMo Fundamentals | NeMo primer |
NeMo Models | Exploring NeMo Model Construction | NeMo models |
ASR | ASR with NeMo | ASR with NeMo |
ASR | Speech Commands | Speech commands |
ASR | Speaker Recognition and Verification | Speaker Recognition and Verification |
ASR | Online Noise Augmentation | Online noise augmentation |
NLP | Using Pretrained Language Models for Downstream Tasks | [Pretrained language models for downstream tasks](https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/01_Pretrained_Language_Models_for_Downstream_Tasks.ipynb) |
NLP | Exploring NeMo NLP Tokenizers | NLP tokenizers |
NLP | Text Classification (Sentiment Analysis) with BERT | Text Classification (Sentiment Analysis) |
NLP | Question answering with SQuAD | Question answering Squad |
NLP | Token Classification (Named Entity Recognition) | Token classification: named entity recognition |
NLP | GLUE Benchmark | GLUE benchmark |
NLP | Punctuation and Capitalization | Punctuation and capitalization |
NLP | Named Entity Recognition - BioMegatron | Named Entity Recognition - BioMegatron |
NLP | Relation Extraction - BioMegatron | Relation Extraction - BioMegatron |
TTS | Speech Synthesis | TTS inference |
This release updates the core training API with PyTorch Lightning. Every NeMo model is a LightningModule that comes equipped with all supporting infrastructure for training and reproducibility. Every NeMo model also has an example configuration file and a corresponding script that contain all the configuration needed for training.
Together, NeMo, PyTorch Lightning, and Hydra give all NeMo models the same look and feel, which makes it easy to do conversational AI research across multiple domains.
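The pattern of one configuration file per model, adjusted per run, can be sketched without any dependencies. The code below imitates dotted `key=value` overrides in plain Python. It is not Hydra's implementation, and the configuration keys (`model.optim.lr`, `trainer.gpus`, and so on) are hypothetical examples rather than the schema of any particular NeMo model.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b.c=value` style overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            # Walk down the nested dicts, creating levels that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = parse_value(raw_value)
    return result


def parse_value(text):
    """Convert override text to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


# Hypothetical base configuration, standing in for a model's example config file.
base_config = {
    "model": {"optim": {"name": "adam", "lr": 0.001}},
    "trainer": {"gpus": 1, "max_epochs": 5},
}

run_config = apply_overrides(base_config, ["trainer.gpus=4", "model.optim.lr=0.01"])
```

Because the base configuration is copied rather than mutated, the defaults stay intact and each run's settings are fully described by the base file plus its list of overrides.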
New models such as Speaker Identification and Megatron BERT broaden the range of available models. Together with the collection and the Docker container, we believe NeMo is on track to become a premier toolkit for conversational AI model building and training.
The NeMo developer guide is available at https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/index.html
GPUs in the Pascal, Volta, Turing and A100 families
NeMo development is based on NVIDIA's PyTorch container, version 20.08-py3.
NeMo is licensed under the Apache License 2.0. By pulling and using the container and models, you accept the terms and conditions of these licenses.
Use the GitHub Issues forum for questions regarding this software.