
BART PyT checkpoint (Summarization, XSum)


Description

BART PyT checkpoint for summarization on XSum dataset

Publisher

NVIDIA Deep Learning Examples

Use Case

Language Modeling

Framework

PyTorch

Latest Version

20.11.0

Modified

October 29, 2021

Size

4.14 GB

Model Overview

BART is a denoising autoencoder for pretraining sequence-to-sequence models.

Model Architecture

BART uses a standard sequence-to-sequence Transformer architecture with GeLU activations. The base model has 6 layers in both the encoder and the decoder, whereas the large model has 12 in each. Overall, the architecture has roughly 10% more parameters than a comparably sized BERT model.
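
The encoder/decoder depths above can be illustrated with PyTorch's generic `nn.Transformer` module. This is a shape-only sketch, not the actual BART implementation: it omits the embeddings, tokenizer, and pretrained weights, and the hidden size of 768 is assumed from BART-base.

```python
import torch.nn as nn

# Illustrative only: a seq2seq Transformer with BART-base-like depth
# (6 encoder + 6 decoder layers) and GELU activations.
model = nn.Transformer(
    d_model=768,           # BART-base hidden size (assumption)
    nhead=12,
    num_encoder_layers=6,  # BART-base; BART-large uses 12
    num_decoder_layers=6,
    activation="gelu",
)
```

Swapping both layer counts to 12 would match the depth of the large variant described above.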

BART is trained by corrupting documents and then optimizing a reconstruction loss between the decoder's output and the original document. The pretraining corruption combines randomly shuffling the order of the original sentences with a novel text-infilling scheme, in which spans of text are replaced with a single mask token.
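
The two corruption steps can be sketched in plain Python. This is not the actual pretraining code; the function name, whitespace tokenization, and fixed span length are assumptions made for illustration:

```python
import random

def corrupt(sentences, mask_token="<mask>", span_len=2, seed=0):
    """Illustrative BART-style corruption: shuffle sentence order,
    then replace a contiguous token span with a single mask token."""
    rng = random.Random(seed)
    shuffled = sentences[:]
    rng.shuffle(shuffled)              # step 1: sentence permutation
    tokens = " ".join(shuffled).split()
    start = rng.randrange(max(1, len(tokens) - span_len))
    # step 2: text infilling -- the whole span collapses to ONE mask
    # token, so the model must also predict how many tokens are missing
    corrupted = tokens[:start] + [mask_token] + tokens[start + span_len:]
    return " ".join(corrupted)
```

Because the span collapses to a single token, the corrupted sequence is shorter than the original, which is what forces the model to infer the span length during reconstruction.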

Training

This model was trained using the training scripts available on NGC and in the NVIDIA Deep Learning Examples GitHub repository.

Dataset

The following datasets were used to train this model:

  • Extreme Summarization (XSum) - a dataset of 226,711 Wayback-archived BBC articles spanning almost a decade (2010 to 2017) and covering a wide variety of domains (e.g., News, Politics, Sports, Weather, Business, Technology, Science, Health, Family, Education, Entertainment, and Arts).

Performance

Performance numbers for this model are available on NGC.

License

This model was trained using open-source software available in the Deep Learning Examples repository. For terms of use, please refer to the license of the scripts and of the datasets from which the model was derived.