This is a cased dialogue state tracking model with a BERT Base encoder, fine-tuned on the Google SGD dataset. The model is based on the architecture presented in the paper "SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services".
SGD-QA is a multi-pass NLU model consisting of a BERT encoder with multi-task heads. We use a pre-trained BERT model with four classification heads and one span prediction head for slot value extraction. The SGD-QA model follows a question answering approach, using the schema description as the query and the dialogue turn as the context. Each input instance is intended for exactly one task, which is implemented by masking out the other tasks' losses. Intent prediction, requested slot prediction, and categorical slot value prediction are formulated as binary sequence classification. The slot status prediction task is a 3-way sequence classification that predicts whether a slot is active, non-active ("none"), or does not matter to the user ("dontcare"). Multiple slots can become active in a single turn. Non-categorical slot value extraction extracts the values for non-categorical slots with a span-based prediction head, similar to SQuAD, where two token classification layers detect the start and end positions of the slot value in the dialogue turn. Like the categorical slot value prediction task, its prediction is only used when the slot status is active.
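To make the question-answering input format concrete, here is a minimal sketch of how one such input instance could be assembled and how the per-task loss masking works. All names here are hypothetical illustrations, not NeMo's actual implementation:

```python
from dataclasses import dataclass

# Hypothetical identifiers for the five tasks described above.
TASKS = ("intent", "requested_slot", "cat_slot_value", "slot_status", "noncat_span")

@dataclass
class SGDQAInstance:
    schema_description: str   # the "question": natural-language description from the schema
    dialogue_turn: str        # the "context": the current dialogue turn
    task: str                 # each instance is intended for exactly one task

    def input_text(self) -> str:
        # BERT-style sentence pair: [CLS] query [SEP] context [SEP]
        return f"[CLS] {self.schema_description} [SEP] {self.dialogue_turn} [SEP]"

def loss_mask(task: str) -> dict:
    # The losses of all other task heads are masked out (weight 0)
    # so only the instance's own task contributes to training.
    return {t: 1.0 if t == task else 0.0 for t in TASKS}

inst = SGDQAInstance(
    schema_description="date of the restaurant reservation",
    dialogue_turn="Book a table for Friday.",
    task="slot_status",
)
mask = loss_mask(inst.task)
```

Since all five heads share one encoder, a batch can freely mix instances of different tasks; the mask decides which head's loss is back-propagated for each instance.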
The model is trained on the full Google SGD dataset. We used an NVIDIA DGX1 with 8 V100 GPUs. Hyperparameters can be found in .
The Google SGD dataset is the largest dataset for goal-oriented dialogue state tracking, with over 20k annotated dialogues covering 45 services spanning 20 domains. The SGD dataset defines an ontology, called a schema, that contains natural-language descriptions for all entities associated with a particular service. Slots are further classified into non-categorical slots and categorical slots. For categorical slots, the schema also includes a list of possible values. The user-system dialogues can be either single-domain or multi-domain, where a user can request two or more services per dialogue. Each turn is labeled with relevant schema information, called the dialogue state, comprising the active intent, requested user slots, and slot assignments that occurred throughout the dialogue. The SGD dataset is designed to test generalization to services beyond those seen at training: 57% of the dev and 78% of the test dataset stem from unseen services. Nevertheless, seen and unseen services can share similar slots and functionality.
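For illustration, a categorical slot entry in an SGD-style schema looks roughly like the sketch below. The field values are invented for this example; consult the dataset's schema files for the exact format:

```python
# Hypothetical schema fragment in the spirit of the SGD schema format:
# every slot carries a natural-language description, and categorical
# slots additionally enumerate their possible values.
slot_entry = {
    "name": "price_range",
    "description": "Price range of the restaurant",
    "is_categorical": True,
    "possible_values": ["cheap", "moderate", "expensive"],
}

def is_valid_slot(slot: dict) -> bool:
    """Categorical slots must enumerate possible values;
    non-categorical slots may leave the list empty because their
    values are extracted as spans from the dialogue."""
    if slot["is_categorical"]:
        return len(slot["possible_values"]) > 0
    return True
```

The descriptions are what SGD-QA feeds to the encoder as queries, which is what lets the model generalize to unseen services that ship their own schema.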
The latest model in the version history achieves the following joint goal accuracy, which measures how often all slot assignments for a turn are predicted correctly.
ALL SERVICES    59.72
SEEN SERVICES   65.70
UNSEEN SERVICES 51.96

ALL SERVICES    45.85
SEEN SERVICES   49.44
UNSEEN SERVICES 44.65
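Joint goal accuracy counts a turn as correct only if the entire predicted slot-value mapping matches the reference exactly. A minimal sketch of the metric (a hypothetical helper, not NeMo's metric code):

```python
def joint_goal_accuracy(predictions, references):
    """Fraction of turns whose full slot->value mapping matches the reference.

    predictions, references: lists of dicts mapping slot name to value,
    one dict per dialogue turn.
    """
    if not references:
        return 0.0
    # A turn counts only if every slot assignment is identical.
    correct = sum(1 for pred, ref in zip(predictions, references) if pred == ref)
    return correct / len(references)

preds = [{"restaurant": "Bistro", "time": "7pm"}, {"restaurant": "Bistro"}]
refs  = [{"restaurant": "Bistro", "time": "7pm"}, {"restaurant": "Cafe"}]
# Only the first of the two turns matches exactly -> 0.5
```

Because a single wrong or missing slot invalidates the whole turn, joint goal accuracy is a much stricter metric than per-slot accuracy.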
```python
import nemo
import nemo.collections.nlp as nemo_nlp

model = nemo_nlp.models.dialogue_state_tracking.sgdqa_model.SGDQAModel.from_pretrained(
    model_name="sgdqa_bertbasecased"
)
```
```shell
python [NEMO_GIT_FOLDER]/examples/nlp/dialogue_state_tracking/sgd_qa.py \
    do_training=false \
    pretrained_model=sgdqa_bertbasecased \
    model.dataset.data_dir=[EXISTING_DATA_DIR] \
    model.dataset.dialogues_example_dir=[PATH_TO_PREPROCESSED_DATA] \
    model.dataset.task_name=sgd_all \
    trainer.gpus=1 \
    model.test_ds.ds_item=["test"]
```
If PATH_TO_PREPROCESSED_DATA does not exist, the data will be preprocessed from
EXISTING_DATA_DIR. If it already exists and you want to load from the cache for better performance, additionally set
The model takes the schema and dialogue JSON files in their original format as input.
Dialogue state predictions made during evaluation can be dumped as JSON files.
Since this model was trained on publicly available datasets, its performance might degrade on custom data that the model has not been trained on.