NGC Catalog

Description

A ZeroShotIntentModel trained by fine-tuning megatron-bert-345m-uncased on the MNLI (Multi-Genre Natural Language Inference) dataset. It achieves accuracies of 90.0% and 89.9% on the matched and mismatched dev sets, respectively.

Publisher

NVIDIA

Latest Version

1.4.1

Modified

April 4, 2023

Size

1.15 GB

Model Overview

This model recognizes the intent of an English query by scoring the query against a list of candidate intent labels supplied at inference time.

Model Architecture

This model consists of a pretrained megatron-bert-345m-uncased model [1] followed by a 2-layer sequence classification head.

Training

The model was trained for two epochs with the NeMo toolkit [2].

Dataset

The model was trained on the MNLI (Multi-Genre Natural Language Inference) dataset [3], available at https://dl.fbaipublicfiles.com/glue/data/MNLI.zip.

Performance

The performance of the model was tested on the MNLI dev sets. MNLI contains two dev sets, matched and mismatched, which contain genres seen or not seen during training, respectively. This model achieves an accuracy of 90.0% and 89.9% on the matched and mismatched dev sets, respectively.
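As a reminder of what the reported figures measure, accuracy is the fraction of dev-set examples whose predicted NLI label matches the gold label. The sketch below computes it on a toy list; the `accuracy` helper and the four example labels are hypothetical and are not taken from the MNLI dev sets.

```python
def accuracy(predicted, gold):
    """Return the fraction of positions where predicted and gold labels agree."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Toy labels for illustration only (not real MNLI data).
gold = ["entailment", "neutral", "contradiction", "neutral"]
predicted = ["entailment", "neutral", "neutral", "neutral"]

toy_accuracy = accuracy(predicted, gold)
print(f"Toy accuracy: {toy_accuracy:.1%}")
```

The same computation would be run once on the matched dev set and once on the mismatched dev set to obtain the two reported numbers.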

How to Use this Model

The model is available for use in the NeMo toolkit [2], and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.

Automatically load the model from NGC

import nemo.collections.nlp as nemo_nlp
model = nemo_nlp.models.ZeroShotIntentModel.from_pretrained(model_name="zeroshotintent_en_megatron_uncased")

Recognizing intents with this model

queries = [
    "What is the weather in Santa Clara tomorrow morning?",
    "I'd like a veggie burger and fries",
    "Play the latest Taylor Swift album",
]

candidate_labels = ["Food order", "Weather query", "Play music"]

predictions = model.predict(queries, candidate_labels)

Input

The predict method of the model accepts two lists of strings. The first list is the list of queries to be classified. The second list is the list of candidate labels.
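One way to picture how two lists of strings can drive a classifier trained on NLI data is to pair every query (as the premise) with every candidate label rewritten as a hypothesis sentence. The sketch below only illustrates that pairing and is not NeMo's implementation; the helper `build_nli_pairs` and the template string "This example is {}." are assumptions made for the example.

```python
from itertools import product


def build_nli_pairs(queries, candidate_labels, hypothesis_template="This example is {}."):
    """Pair each query (premise) with each label rendered as a hypothesis.

    The template is an assumption for illustration; the real model may
    phrase hypotheses differently.
    """
    return [
        (query, hypothesis_template.format(label))
        for query, label in product(queries, candidate_labels)
    ]


queries = ["Play the latest Taylor Swift album"]
candidate_labels = ["Food order", "Play music"]

pairs = build_nli_pairs(queries, candidate_labels)
for premise, hypothesis in pairs:
    print(f"premise={premise!r}  hypothesis={hypothesis!r}")
```

Under this view, a label whose hypothesis is judged to be entailed by the query would receive a high score.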

Output

The predict method returns a list of dictionaries containing one dictionary per input query. Each dictionary has keys "sentence", "labels", and "scores". "sentence" contains the input query, and "labels" and "scores" are parallel lists (with each score corresponding to the label at the same index), sorted from highest to lowest score.
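Because the labels are sorted by score, the most likely intent for a query is the first entry of "labels". The sketch below shows one way to read that structure. The `predictions` list is a hand-written stand-in with made-up scores, and the `top_intent` helper and its 0.5 threshold are hypothetical choices, not part of the model's API.

```python
# Hand-written stand-in for the output of predict(); scores are made up.
predictions = [
    {
        "sentence": "I'd like a veggie burger and fries",
        "labels": ["Food order", "Play music", "Weather query"],
        "scores": [0.97, 0.02, 0.01],
    },
    {
        "sentence": "Tell me a joke",
        "labels": ["Play music", "Food order", "Weather query"],
        "scores": [0.20, 0.10, 0.05],
    },
]


def top_intent(prediction, threshold=0.5):
    """Return the highest-scoring label, or None if its score is below threshold."""
    best_label = prediction["labels"][0]  # lists are sorted from highest to lowest score
    best_score = prediction["scores"][0]
    return best_label if best_score >= threshold else None


for prediction in predictions:
    print(prediction["sentence"], "->", top_intent(prediction))
```

Returning None for low scores is a design choice that lets a caller fall back to a default response when no candidate label fits the query well.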

Limitations

No known limitations at this time.

References

[1] Shoeybi, M. et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

[2] NVIDIA NeMo Toolkit

[3] Williams, A. et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

License

License to use this model is covered by the NGC TERMS OF USE unless another License/Terms Of Use/EULA is clearly specified. By downloading the public and release version of the model, you accept the terms and conditions of the TERMS OF USE.

ZeroShotIntent En Megatronuncased