NVIDIA Deep Learning Examples
Electra-Base checkpoint (TensorFlow2, AMP, Squad1.0, seqLen384)

ELECTRA-Base TensorFlow2 checkpoint fine-tuned on SQuAD 1.1 with seqLen=384

Model Overview

ELECTRA is a method of pre-training language representations that outperforms existing techniques on a wide array of NLP tasks.

Model Architecture

ELECTRA combines two Transformer models: a generator and a discriminator. The generator replaces tokens in a sequence and is therefore trained as a masked language model. The discriminator, which is the model we are interested in, tries to identify which tokens in the sequence were replaced by the generator. Both the generator and the discriminator use the same architecture as the Transformer encoder: a stack of Transformer blocks, each consisting of a multi-head attention layer followed by successive stages of feed-forward networks and layer normalization. The multi-head attention layer performs self-attention on multiple input representations.
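The discriminator's objective described above can be illustrated with a minimal sketch. It is not the training script's code: the function name `rtd_labels` and the example tokens are hypothetical, and the sketch assumes that a position counts as "replaced" only when the generator's token differs from the original one.

```python
def rtd_labels(original, corrupted):
    """Build per-token targets for a replaced-token-detection discriminator.

    Returns 1 where the generator's token differs from the original token
    and 0 where the token is unchanged. (Hypothetical helper, for
    illustration only.)
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Hypothetical example: the generator replaced "cooked" with "ate".
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]

labels = rtd_labels(original, corrupted)
print(labels)  # [0, 0, 1, 0, 0]
```

The discriminator sees only the corrupted sequence and is trained to predict these per-token labels, so every position contributes to its loss rather than only the masked ones.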

Figure 1-1

Training

This model was trained using the script available on NGC and in the GitHub repository.

Dataset

The following datasets were used to train this model:

  • SQuAD 1.1 + 2.0 - Reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. The answer to each question is a segment of text (a span) from the corresponding reading passage, or the question may be unanswerable.
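To make the notion of an answer span concrete, here is a minimal sketch of how a span can be recovered from a SQuAD-style record. The field names `answers`, `text`, `answer_start`, and `is_impossible` follow the published SQuAD JSON, but the flattened record layout, the example text, and the helper `answer_span` are assumptions made for illustration; they are not taken from the training script.

```python
def answer_span(record):
    """Return (start, end) character offsets of the first answer, or None.

    None is returned for unanswerable questions. The span is checked
    against the context so that a wrong offset is caught early.
    (Hypothetical helper, for illustration only.)
    """
    if record.get("is_impossible") or not record["answers"]:
        return None
    answer = record["answers"][0]
    start = answer["answer_start"]
    end = start + len(answer["text"])
    if record["context"][start:end] != answer["text"]:
        raise ValueError("answer_start does not match the context")
    return start, end


# Hypothetical answerable record (flattened to one question per context).
answerable = {
    "context": "ELECTRA trains a discriminator to detect replaced tokens.",
    "question": "What does ELECTRA train?",
    "answers": [{"text": "a discriminator", "answer_start": 15}],
    "is_impossible": False,
}

# Hypothetical unanswerable record, as permitted in SQuAD 2.0.
unanswerable = {
    "context": "ELECTRA trains a discriminator to detect replaced tokens.",
    "question": "Who invented the telephone?",
    "answers": [],
    "is_impossible": True,
}

span = answer_span(answerable)
print(span)                       # (15, 30)
print(answer_span(unanswerable))  # None
```

A question-answering model fine-tuned on this data predicts the start and end of such a span in token space, so character offsets like these are typically mapped to token positions during preprocessing.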

Performance

Performance numbers for this model are available in NGC.

License

This model was trained using open-source software available in the Deep Learning Examples repository.
For terms of use, please refer to the license of the script and the datasets the model was derived from.

Publisher: NVIDIA Deep Learning Examples
Latest Version: 20.07.0_amp
Updated: April 4, 2023 UTC
Compressed Size: 1.23 GB
