NGC | Catalog

BioMegatron345m-biovocab-30k-cased


Description

Megatron-LM 345m parameters model with biomedical vocabulary (30k size) cased, pre-trained on PubMed biomedical text corpus.

Publisher

-

Use Case

Natural Language Processing

Framework

PyTorch

Latest Version

1

Modified

October 28, 2020

Size

1.25 GB

Overview

This is a checkpoint for BioMegatron 345m with biomedical domain vocabulary (30k size), cased.

Megatron is a large, powerful transformer language model developed by the Applied Deep Learning Research team at NVIDIA, trained with multi-node parallelism in mixed precision. Unlike BERT, Megatron swaps the positions of the layer normalization and the residual connection in each transformer layer (similar to the GPT-2 architecture), which allows the models to keep improving as they are scaled up. Megatron reaches higher scores than BERT on a range of Natural Language Processing (NLP) tasks. BioMegatron has the same network architecture as Megatron but is pre-trained on a different dataset, PubMed, a large biomedical text corpus, and therefore achieves better performance than the original Megatron on biomedical downstream tasks.
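The layer-normalization difference described above can be sketched in a few lines of PyTorch. This is an illustrative comparison of the two sub-layer orderings, not the actual Megatron implementation; module names and sizes here are arbitrary:

```python
import torch
import torch.nn as nn


class PostLNBlock(nn.Module):
    """BERT-style block: residual add first, then LayerNorm (post-LN)."""

    def __init__(self, d_model, n_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.norm1(x + self.attn(x, x, x)[0])  # norm AFTER the residual
        x = self.norm2(x + self.ff(x))
        return x


class PreLNBlock(nn.Module):
    """Megatron/GPT-2-style block: LayerNorm BEFORE each sub-layer (pre-LN),
    leaving the residual path as a clean identity connection."""

    def __init__(self, d_model, n_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        h = self.norm1(x)                # norm BEFORE the sub-layer
        x = x + self.attn(h, h, h)[0]    # residual stays an identity path
        x = x + self.ff(self.norm2(x))
        return x
```

Keeping the residual path free of normalization is what helps gradients flow cleanly through very deep stacks, which is why the pre-LN ordering scales better.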

The checkpoint contains:

  • model_optim_rng.pt - pre-trained Megatron model weights
  • pubmed_merged-all-cased.vocab.txt - the biomedical vocabulary file (30k size, cased) used to train this checkpoint

More details about the model can be found in the BioMegatron paper: https://arxiv.org/abs/2010.06060

Documentation

Source code and developer guides are available at https://github.com/NVIDIA/NeMo and https://github.com/NVIDIA/Megatron-LM

This model checkpoint can be used for fine-tuning on biomedical downstream tasks, such as named entity recognition (NER), question answering (QA), and relation extraction (RE).
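For NER-style tasks, fine-tuning amounts to putting a per-token classifier on top of the pre-trained encoder. The sketch below is a generic PyTorch pattern, not the NeMo recipe; `encoder` is a stand-in for the BioMegatron body, and the sizes are arbitrary:

```python
import torch
import torch.nn as nn


class TokenClassificationHead(nn.Module):
    """Per-token linear classifier for NER-style fine-tuning. `encoder` is
    a placeholder for the pre-trained model body: any module mapping token
    ids of shape (batch, seq) to hidden states (batch, seq, hidden) fits."""

    def __init__(self, encoder, hidden_size, num_labels, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids):
        hidden = self.encoder(input_ids)              # (batch, seq, hidden)
        return self.classifier(self.dropout(hidden))  # (batch, seq, labels)
```

Training then minimizes a token-level cross-entropy loss over the entity labels; the NeMo tutorials below wire this up end to end with the actual BioMegatron encoder.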

The following examples show how to fine-tune BioMegatron on different downstream tasks.

Usage example 1: Fine-tune on Named Entity Recognition

https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Token_Classification-BioMegatron.ipynb

Usage example 2: Fine-tune on Relation Extraction

https://github.com/NVIDIA/NeMo/blob/main/tutorials/nlp/Relation_Extraction-BioMegatron.ipynb