
Megatron GPT2 345M

345M-parameter generative GPT Megatron model
Latest version: April 4, 2023 | Size: 1.32 GB

Megatron-LM GPT2 345M

Megatron is a large, powerful transformer. For this particular Megatron model, we trained a generative, left-to-right transformer in the style of GPT-2. The model has 345 million parameters, with 24 transformer layers, 16 attention heads, and a hidden size of 1024.
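As a quick sanity check on those hyperparameters, the sketch below estimates the parameter count of a GPT-2-style decoder. The vocabulary size (50,257) and maximum sequence length (1,024) are standard GPT-2 values assumed here; they are not stated on this card.

```python
# Rough parameter count for a GPT-2-style decoder from its hyperparameters.
# Vocabulary size and sequence length are assumed standard GPT-2 values,
# not taken from this model card.

def gpt2_param_count(layers: int, hidden: int,
                     vocab: int = 50_257, seq_len: int = 1_024) -> int:
    embeddings = vocab * hidden + seq_len * hidden     # token + position embeddings
    attention = 4 * hidden * hidden + 4 * hidden       # QKV + output projections (with biases)
    mlp = 8 * hidden * hidden + 5 * hidden             # two linear layers, 4x expansion
    layer_norms = 2 * 2 * hidden                       # two LayerNorms per block (gain + bias)
    per_layer = attention + mlp + layer_norms
    final_ln = 2 * hidden                              # final LayerNorm
    return embeddings + layers * per_layer + final_ln

print(f"{gpt2_param_count(layers=24, hidden=1024) / 1e6:.0f}M parameters")  # ~355M
```

This prints roughly 355M, consistent with the nominal "345M" naming, which excludes some of the embedding and bias terms.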

This model was trained on text sourced from Wikipedia, RealNews, OpenWebText, and CC-Stories.

For more information about NeMo Megatron, see the NeMo Megatron section of the NeMo documentation.

How to use this Model

NVIDIA NeMo can be used to run text generation with this model and to prompt-tune or p-tune it. Tutorial notebooks on p-tuning the model for multiple NLP tasks can be found on the NeMo tutorials page.
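As an illustration, here is a minimal text-generation sketch assuming NeMo 1.x. The checkpoint filename is a placeholder for the file downloaded from NGC, and the exact Trainer/strategy setup and generate signature may differ between NeMo releases.

```python
# Minimal text-generation sketch for NeMo 1.x (API details vary by release).
# "megatron_gpt_345m.nemo" is a placeholder for the checkpoint from NGC.
from pytorch_lightning import Trainer
from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
from nemo.collections.nlp.parts.nlp_overrides import NLPDDPStrategy

# Megatron-based NeMo models expect a Lightning Trainer with the NLP DDP strategy.
trainer = Trainer(accelerator="gpu", devices=1, strategy=NLPDDPStrategy())
model = MegatronGPTModel.restore_from("megatron_gpt_345m.nemo", trainer=trainer)
model.freeze()  # inference only

output = model.generate(
    inputs=["Deep learning is"],
    length_params={"max_length": 50, "min_length": 1},
)
print(output["sentences"][0])
```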

Source code and a developer guide are available in the NVIDIA NeMo repository on GitHub; refer to the NeMo documentation for details.


Limitations

No known limitations are available at this time.


References

  1. P-tuning: An Effective Prompt Engineering Method to Significantly Improve the Performance of Your Large NLP Model


License

License to use this model is covered by the NGC TERMS OF USE unless another License/Terms of Use/EULA is clearly specified. By downloading the public and release version of the model, you accept the terms and conditions of the NGC TERMS OF USE.