FinMegatron345m-gpt2-bpe

Description

FSI: Financial Megatron GPT2 model with 345m parameters and a BPE tokenizer (GPT vocabulary and merge files), pre-trained on subsets of the CC-100 text corpus.

Publisher

NVIDIA NeMo

Latest Version

Modified

April 4, 2023

Size

1.32 GB

Overview

This is a .nemo file for the FSI Financial Megatron GPT2 345m model with an uncased BPE tokenizer.

Please be sure to download the latest version in order to ensure compatibility with the latest NeMo release.
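Before loading the checkpoint into NeMo, it can be useful to check what the downloaded file contains. The sketch below is a minimal, standard-library-only example. It assumes the .nemo file is a tar archive (optionally compressed) and that the configuration is stored in a member named model_config.yaml. Both are assumptions about the packaging, not statements from this model card, and the function names are hypothetical helpers.

```python
import tarfile


def list_nemo_members(path):
    """Return the names of all members stored in a .nemo archive.

    Assumption: the .nemo file is a tar archive, possibly compressed.
    """
    # "r:*" asks tarfile to detect the compression scheme automatically.
    with tarfile.open(path, "r:*") as archive:
        return archive.getnames()


def read_nemo_config(path, config_name="model_config.yaml"):
    """Return the text of the first member whose name ends with config_name.

    Assumption: the archive stores its configuration under that file name.
    Returns None when no such member exists.
    """
    with tarfile.open(path, "r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(config_name):
                handle = archive.extractfile(member)
                return handle.read().decode("utf-8")
    return None


# Example usage (the file name is a placeholder for the downloaded file):
# print(list_nemo_members("model.nemo"))
# print(read_nemo_config("model.nemo"))
```

If the configuration is found, values such as the number of layers and the hidden size can be compared with the figures listed under Model Architecture.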

Model Architecture

NeMo Megatron is a new capability in the NeMo framework that allows developers to effectively train and scale language models to billions of parameters.

This 345m-parameter model has 24 layers (Transformer blocks), 1024 hidden units, and 16 attention heads.
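The layer count, hidden size, and head count can be sanity-checked against the nominal model size with a short calculation. The sketch below assumes a standard GPT-2 style block (fused QKV projection, 4x MLP expansion, two LayerNorms per block, output layer tied to the input embeddings), a vocabulary of 50257 tokens, and 1024 learned positions. None of these assumptions is stated in this model card, and a padded vocabulary in the actual checkpoint would change the total slightly.

```python
def gpt2_param_count(num_layers, hidden_size, vocab_size, max_positions):
    """Approximate parameter count of a GPT-2 style decoder.

    Assumes weights and biases everywhere, a 4x MLP expansion, and an
    output layer that shares weights with the token embedding.
    """
    h = hidden_size
    attention = 4 * h * h + 4 * h    # QKV (3h^2 + 3h) plus output projection (h^2 + h)
    mlp = 8 * h * h + 5 * h          # h -> 4h (4h^2 + 4h) and 4h -> h (4h^2 + h)
    layer_norms = 4 * h              # two LayerNorms, each with scale and bias
    per_layer = attention + mlp + layer_norms
    embeddings = vocab_size * h + max_positions * h
    final_layer_norm = 2 * h
    return num_layers * per_layer + embeddings + final_layer_norm


NUM_LAYERS = 24       # from the model card
HIDDEN_SIZE = 1024    # from the model card
NUM_HEADS = 16        # from the model card
VOCAB_SIZE = 50257    # assumption: standard GPT-2 BPE vocabulary size
MAX_POSITIONS = 1024  # assumption: maximum sequence length

head_dim = HIDDEN_SIZE // NUM_HEADS
total = gpt2_param_count(NUM_LAYERS, HIDDEN_SIZE, VOCAB_SIZE, MAX_POSITIONS)

print(f"head dimension: {head_dim}")                   # head dimension: 64
print(f"approximate parameters: {total / 1e6:.1f}M")   # approximate parameters: 354.8M
```

Under these assumptions the estimate is about 355 million parameters, in the same range as the 345m in the model name. Roughly 302 million of them sit in the 24 Transformer blocks and the rest in the embeddings.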

For more information about NeMo Megatron, visit https://github.com/NVIDIA/NeMo

Dataset

This model was trained on finance-related documents sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories.

How to use this Model

NVIDIA NeMo can be used with this model for text generation and for prompt tuning or p-tuning. Tutorial notebooks on p-tuning the model for multiple NLP tasks can be found on the tutorials page of NeMo.

NVIDIA NeMo can also be used to fine-tune the model for a number of different tasks. The Financial Megatron model is used in the same way as Megatron models for other domains: tutorial notebooks on fine-tuning the model for Named Entity Recognition and Relation Extraction can be found on the tutorials page of NeMo.

Source code and a developer guide are available at https://github.com/NVIDIA/NeMo

Refer to the documentation at https://docs.nvidia.com/deeplearning/nemo/neural-modules-release-notes/index.html

Limitations

There are no known limitations at this time.

License

License to use this model is covered by the NGC TERMS OF USE unless another License/Terms Of Use/EULA is clearly specified. By downloading the public and release version of the model, you accept the terms and conditions of the NGC TERMS OF USE.