
ZeroShotIntent En Megatronuncased



A ZeroShotIntentModel trained by fine-tuning megatron-bert-345m-uncased on the MNLI (Multi-Genre Natural Language Inference) dataset. It achieves an accuracy of 90.0% and 89.9% on the matched and mismatched dev sets, respectively.







Latest Version



October 7, 2021


1.15 GB

Model Overview

This model can be used for recognizing the intent of a query in English.

Model Architecture

This model consists of a pretrained megatron-bert-345m-uncased model [1] followed by a 2-layer sequence classification head.


The model was trained for two epochs with the NeMo toolkit [2].


The model was trained on the MNLI (Multi-Genre Natural Language Inference) dataset [3].


The performance of the model was tested on the MNLI dev sets. MNLI contains two dev sets, matched and mismatched, which contain genres seen or not seen during training, respectively. This model achieves an accuracy of 90.0% and 89.9% on the matched and mismatched dev sets, respectively.

How to Use this Model

The model is available for use in the NeMo toolkit [2], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.

Automatically load the model from NGC

import nemo.collections.nlp as nemo_nlp
model = nemo_nlp.models.ZeroShotIntentModel.from_pretrained(model_name="zeroshotintent_en_megatron_uncased") 

Recognizing intents with this model

# Queries to classify
queries = [
    "What is the weather in Santa Clara tomorrow morning?",
    "I'd like a veggie burger and fries",
    "Play the latest Taylor Swift album",
]

# Candidate intent labels, written as plain-text descriptions
candidate_labels = ["Food order", "Weather query", "Play music"]

predictions = model.predict(queries, candidate_labels)


The predict method of the model accepts two lists of strings. The first list is the list of queries to be classified. The second list is the list of candidate labels.
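Because the model is fine-tuned on an NLI task, a common way to apply such a model to zero-shot classification is to pair each query (the premise) with a hypothesis sentence built from each candidate label, then score every pair for entailment. The sketch below only illustrates that pairing step. The helper name `build_nli_pairs` and the template "This example is {}." are assumptions made for illustration, not a description of what the model does internally.

```python
from itertools import product


def build_nli_pairs(queries, candidate_labels, hypothesis_template="This example is {}."):
    """Return one (premise, hypothesis) pair per query/label combination.

    Hypothetical helper: the template string is an assumption, not taken
    from the model card.
    """
    return [
        (query, hypothesis_template.format(label))
        for query, label in product(queries, candidate_labels)
    ]


pairs = build_nli_pairs(
    ["Play the latest Taylor Swift album"],
    ["Food order", "Play music"],
)
for premise, hypothesis in pairs:
    print(premise, "->", hypothesis)
```

With N queries and M candidate labels this produces N x M pairs, so inference cost grows with the number of candidate labels.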


The predict method returns a list of dictionaries containing one dictionary per input query. Each dictionary has keys "sentence", "labels", and "scores". "sentence" contains the input query, and "labels" and "scores" are parallel lists (with each score corresponding to the label at the same index), sorted from highest to lowest score.
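Since "labels" and "scores" are sorted from highest to lowest score, the top intent for each query is the first entry of each list. The following is a minimal sketch of reading the output in that format. The helper `top_intents`, its `threshold` parameter, and the sample scores are hypothetical and are not real model output.

```python
def top_intents(predictions, threshold=0.0):
    """Map each query to its best (label, score), or None if the score is below threshold."""
    results = {}
    for pred in predictions:
        # "labels" and "scores" are parallel lists sorted best-first,
        # so index 0 holds the highest-scoring label.
        label, score = pred["labels"][0], pred["scores"][0]
        results[pred["sentence"]] = (label, score) if score >= threshold else None
    return results


# Illustrative data only: these scores are made up to show the structure.
example_predictions = [
    {
        "sentence": "I'd like a veggie burger and fries",
        "labels": ["Food order", "Play music", "Weather query"],
        "scores": [0.97, 0.02, 0.01],
    },
]

print(top_intents(example_predictions, threshold=0.5))
```

A threshold can be useful when none of the candidate labels fits a query well; the right value depends on the application and would need to be tuned on real data.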


No known limitations at this time.


[1] Shoeybi, M. et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

[2] NVIDIA NeMo Toolkit

[3] Williams, A. et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference


License to use this model is covered by the NGC TERMS OF USE unless another License/Terms Of Use/EULA is clearly specified. By downloading the public and release version of the model, you accept the terms and conditions of the TERMS OF USE.