Supported Runtime(s): TensorRT-LLM
Supported Hardware(s): Ampere, Hopper
Supported OS(s): Linux
Terms of Use: By using this model, you are agreeing to the terms and conditions of the license.
Mistral-7B Instruct v0.3 is an instruction-tuned generative text model developed by Mistral AI. Model details can be found here. This model is optimized through the NVIDIA NeMo Framework and is provided as a .nemo checkpoint.
While the examples below refer to Mistral-7B v0.1, they are directly compatible with Mistral-7B Instruct v0.3.
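Because the model is instruction-tuned, prompts are typically wrapped in Mistral's instruct template before inference. The helper below is a hypothetical illustration of that convention, not part of the NeMo API; the exact template (special tokens, spacing) can vary by model version and tokenizer, so verify it against the model's chat template.

```python
# Illustrative only: Mistral instruct-tuned checkpoints generally expect user
# messages wrapped in [INST] ... [/INST] tags. Exact templating (BOS token,
# whitespace) may differ across versions; check the model's tokenizer config.
def build_prompt(user_message: str) -> str:
    """Wrap a user message in a Mistral-style instruct template."""
    return f"<s>[INST] {user_message} [/INST]"

prompt = build_prompt("Summarize the PubMedQA task in one sentence.")
print(prompt)
```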
NeMo Framework offers support for various parameter-efficient fine-tuning (PEFT) methods for Mistral-7B Instruct v0.3.
PEFT techniques allow customizing foundation models to improve performance on specific tasks.
Two of these methods, P-Tuning and Low-Rank Adaptation (LoRA), are supported out of the box for Mistral-7B Instruct v0.3 and are described in detail in the NeMo Framework user guide, which shows how to tune Mistral-7B-v0.1 to answer biomedical questions based on PubMedQA.
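To make the parameter-efficiency idea concrete, here is a minimal NumPy sketch of what LoRA does mathematically. It is not the NeMo implementation: LoRA freezes a pretrained weight matrix W and learns two small low-rank factors A and B, so only A and B are trained while the effective weight becomes W + (alpha / r) * B @ A.

```python
import numpy as np

# Hypothetical illustration of the LoRA idea (not the NeMo API). A frozen
# weight W (d_out x d_in) is adapted via trainable low-rank factors
# B (d_out x r) and A (r x d_in), scaled by alpha / r.
d_out, d_in, rank, alpha = 4096, 4096, 16, 32

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((rank, d_in))    # trainable low-rank factor
B = np.zeros((d_out, rank))              # zero-initialized: W is unchanged at start

# Effective weight used during fine-tuning and inference:
W_eff = W + (alpha / rank) * (B @ A)

full_params = W.size                     # parameters touched by full fine-tuning
lora_params = A.size + B.size            # parameters LoRA actually trains
print(f"trainable: {lora_params} vs full: {full_params} "
      f"({full_params / lora_params:.0f}x fewer)")
```

With rank 16 on a 4096x4096 layer, LoRA trains roughly 128x fewer parameters than full fine-tuning of that layer, which is why the adapter checkpoints are small and cheap to swap.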
NeMo Framework offers supervised fine-tuning (SFT) support for Mistral-7B Instruct v0.3.
Fine-tuning modifies the weights of a pre-trained foundation model with additional custom data. Supervised fine-tuning (SFT) unfreezes all the weights and layers in the model and trains on a newly labeled set of examples. Fine-tuning can incorporate new, domain-specific knowledge or teach the foundation model what type of response to provide.
NeMo Framework offers out-of-the-box SFT support for Mistral-7B Instruct v0.3. The process is described in detail for Mistral-7B v0.1 in the NeMo Framework user guide and is directly compatible with Mistral-7B Instruct v0.3.
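As a sketch of what a labeled SFT dataset looks like, the snippet below writes a JSONL file with one example per line. NeMo's SFT recipes commonly read records with "input" and "output" fields, but the field names and prompt template are configurable, so treat these keys as an assumption and check the user guide for your recipe.

```python
import json

# Assumed record schema ("input"/"output"); NeMo SFT data configs let you
# map other field names via the prompt template. Examples are illustrative.
examples = [
    {
        "input": "Does metformin reduce cardiovascular risk in type 2 diabetes?",
        "output": "Yes, long-term trial follow-up suggests a reduction in events.",
    },
    {
        "input": "Is aspirin recommended for primary prevention in low-risk adults?",
        "output": "No, current guidance generally advises against routine use.",
    },
]

# One JSON object per line is the usual JSONL convention for SFT corpora.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

with open("train.jsonl") as f:
    records = [json.loads(line) for line in f]
print(len(records), "labeled records")
```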
Using TensorRT-LLM, NeMo Framework allows exporting Mistral-7B Instruct v0.3 checkpoints to formats that are optimized for deployment on NVIDIA GPUs.
TensorRT-LLM is a library that allows building TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
These optimizations make it possible to reach state-of-the-art inference performance using the export and deployment methods that NVIDIA built for Mistral-7B Instruct v0.3. This process has been described in detail in the NeMo Framework user guide.
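The export path can be sketched roughly as below. This is a hedged sketch, not a verified script: it requires a GPU machine with the NeMo Framework container, the checkpoint and engine paths are placeholders, and parameter names (for example the maximum-output-token argument) have changed between NeMo releases, so consult the user guide for the exact signature on your version.

```python
# Sketch only: requires NeMo Framework with TensorRT-LLM support and a GPU.
# Paths are placeholders; argument names vary across NeMo versions.
from nemo.export import TensorRTLLM

exporter = TensorRTLLM(model_dir="/opt/checkpoints/trt_llm_engine")

# Build a TensorRT engine from the .nemo checkpoint.
exporter.export(
    nemo_checkpoint_path="/opt/checkpoints/mistral-7b-instruct-v0.3.nemo",
    model_type="mistral",
    n_gpus=1,
)

# Run a quick smoke-test generation against the built engine.
output = exporter.forward(["What is the capital of France?"])
print(output)
```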
Detailed information, including performance results, is available here.