Meta Llama 3 8B
Description
Meta Llama 3 8B is a pretrained decoder-only, text-to-text base model. It was trained on 15 trillion tokens of data from publicly available sources by Meta.
Publisher
Meta
Latest Version
1.0
Modified
July 2, 2024
Size
14.96 GB


Meta Llama 3 8B Terms of Use

This model is licensed under the Meta Llama 3 Community License Agreement.

Model Description

Meta-Llama-3-8B is a pretrained decoder-only, text-to-text base model. It was trained on 15 trillion tokens of data from publicly available sources.

Meta-Llama-3-8B builds on the existing work done for the Llama and Llama 2 families of models, making a few key improvements:

  • A 128K-token vocabulary, allowing language to be encoded more efficiently
  • Grouped-Query Attention (GQA) for all model sizes
  • An 8,192-token context window during training, with cross-document masking
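The grouped-query attention change above can be illustrated with a small shape sketch. The head counts below (32 query heads, 8 key/value heads) match the published Llama 3 8B configuration, but the code is an illustrative NumPy toy, not the model's actual implementation:

```python
import numpy as np

# Illustrative grouped-query attention (GQA) shapes.
# Llama 3 8B uses 32 query heads and 8 KV heads, so 4 query
# heads share each KV head; seq and head_dim here are arbitrary.
n_q_heads, n_kv_heads, head_dim, seq = 32, 8, 128, 16
group = n_q_heads // n_kv_heads  # 4 query heads per KV head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, head_dim))
k = rng.standard_normal((n_kv_heads, seq, head_dim))
v = rng.standard_normal((n_kv_heads, seq, head_dim))

# Expand each KV head so it serves its group of query heads.
k_exp = np.repeat(k, group, axis=0)  # (32, seq, head_dim)
v_exp = np.repeat(v, group, axis=0)

scores = q @ k_exp.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)
out = weights @ v_exp  # (32, seq, head_dim)
print(out.shape)
```

The KV cache only needs to store 8 heads instead of 32, which is the memory saving GQA is designed for.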

In their development of these models, Meta took great care to optimize for helpfulness and safety.

More details on the model can be found here.

This model is optimized with the NVIDIA NeMo Framework and is provided as a .nemo checkpoint.

Getting Started

Step 1: Fetching the model file

  1. Download the model from NGC using the NGC CLI tool.
ngc registry model download-version nvidia/nemo/llama_3_8B:1.0
  2. (Optional, recommended) Pre-extract the model from the .nemo file. This reduces model-loading overhead because the NeMo scripts will not have to extract the model from the .nemo archive every time they run. This step is not required; NeMo scripts can also ingest the .nemo file directly.
mkdir llama_3_8B
tar -xvf TODO.nemo -C llama_3_8B

You can find the instructions to install and configure the NGC CLI tool here.
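If the NGC CLI is installed but not yet configured, a typical first-time setup looks like the following. The commands are standard NGC CLI subcommands; the API key is obtained from your NGC account:

```shell
# Configure the NGC CLI interactively; you will be prompted for
# your API key, org, and team.
ngc config set

# Verify the active configuration before downloading.
ngc config current
```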

Step 2: Fetch the NeMo Framework container

You can use the NeMo Framework container available on NGC which comes preloaded with all the required dependencies.

export NEMO_IMAGE="nvcr.io/nvidia/nemo:24.05"
docker pull $NEMO_IMAGE

Step 3: Run the NeMo container

Run the NeMo Framework container, mounting the model directory.

# Full path to the extracted model directory
export MODEL_DIR=$(pwd)/llama_3_8B
docker run --gpus all -it --rm  -v ${MODEL_DIR}:/model $NEMO_IMAGE

Step 4: Follow The Desired Playbooks (Coming Soon!)

We're hard at work preparing resources to make it as easy as possible to use the Meta-Llama-3-8B .nemo checkpoint!