NGC Catalog
Mistral-7B-Instruct-v0.2

Description: Mistral-7B-v0.2 is an instruction-tuned generative text model developed by Mistral AI.
Publisher: Mistral AI
Latest Version: 1.0
Modified: November 27, 2024
Size: 26.98 GB

NVIDIA Validated

  • Supported Runtime(s): TensorRT-LLM
  • Supported Hardware(s): Ampere, Hopper
  • Supported OS(s): Linux

Terms of Use: By using this model, you are agreeing to the terms and conditions of the license.

Mistral 7B Instruct v0.2

Mistral-7B-v0.2 is an instruction-tuned generative text model developed by Mistral AI. Model details can be found here. This model has been optimized with the NVIDIA NeMo Framework and is provided as a .nemo checkpoint.

Benefits of using Mistral-7B Instruct v0.2 checkpoints in NeMo Framework

While the examples below refer to Mistral-7B v0.1, they are directly compatible with Mistral-7B Instruct v0.2.

P-Tuning and LoRA

NeMo Framework offers support for various parameter-efficient fine-tuning (PEFT) methods for Mistral-7B Instruct v0.2.

PEFT techniques customize a foundation model for specific tasks by training only a small number of additional parameters while the base weights stay frozen.

Two of these methods, P-Tuning and Low-Rank Adaptation (LoRA), are supported out of the box for Mistral-7B Instruct v0.2 and are described in detail in the NeMo Framework user guide, which shows how to tune Mistral-7B-v0.1 to answer biomedical questions based on PubMedQA.
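The parameter savings behind LoRA can be illustrated with a quick back-of-the-envelope calculation. This is a generic sketch of the idea, not NeMo's implementation, and the dimensions below are illustrative:

```python
# Illustration of LoRA's parameter savings (generic sketch, not NeMo code).
# LoRA freezes a d_out x d_in weight matrix W and trains only two small
# matrices B (d_out x r) and A (r x d_in); the adapted weight is W + B @ A.
d_in, d_out, r = 4096, 4096, 8          # illustrative sizes; r is the LoRA rank

full_params = d_in * d_out              # parameters in the frozen matrix W
lora_params = r * (d_in + d_out)        # trainable parameters in A and B

print(full_params)                      # 16777216
print(lora_params)                      # 65536
print(lora_params / full_params)        # 0.00390625, i.e. ~0.4% of W
```

Because only A and B receive gradients, optimizer state and checkpoint deltas shrink by the same ratio, which is why LoRA fits on much smaller hardware than full fine-tuning.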

Supervised Fine-tuning

NeMo Framework offers supervised fine-tuning (SFT) support for Mistral-7B Instruct v0.2.

Fine-tuning refers to modifying the weights of a pre-trained foundation model with additional custom data. Supervised fine-tuning (SFT) unfreezes all the weights and layers of the model and trains it on a newly labeled set of examples. One can fine-tune to incorporate new, domain-specific knowledge or to teach the foundation model what type of response to provide.

NeMo Framework offers out-of-the-box SFT support for Mistral-7B Instruct v0.2; the process is described in detail for Mistral-7B v0.1 in the NeMo Framework user guide and applies directly to Mistral-7B Instruct v0.2.
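For concreteness, SFT training data is typically supplied as JSON-lines records pairing a prompt with its desired response. The field names below (`input`/`output`) are an assumption for illustration; NeMo's dataset configuration lets you map the keys your data actually uses:

```python
import json

# Hypothetical SFT example in JSON-lines form: one labeled prompt/response
# pair per line. The "input"/"output" field names are an assumption; check
# the NeMo dataset config for the keys your setup expects.
record = {
    "input": "Summarize the main benefit of supervised fine-tuning.",
    "output": "SFT adapts all model weights to labeled examples, teaching the "
              "model domain knowledge and the desired response style.",
}
line = json.dumps(record)
print(line)
```

A training file is simply many such lines, one JSON object per line, split into train/validation sets.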

Optimized Deployment with TensorRT-LLM

Using TensorRT-LLM, NeMo Framework can export Mistral-7B Instruct v0.2 checkpoints to formats optimized for deployment on NVIDIA GPUs.

TensorRT-LLM is a library for building TensorRT engines that contain state-of-the-art optimizations for efficient inference on NVIDIA GPUs.

This makes it possible to reach state-of-the-art inference performance using the export and deployment methods NVIDIA built for Mistral-7B Instruct v0.2. The process is described in detail in the NeMo Framework user guide.
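A minimal sketch of the export path is shown below. It assumes the NeMo Framework container with TensorRT-LLM installed; the paths and the `model_type` value are assumptions, so check the user guide for the exact arguments your NeMo version expects:

```python
def export_mistral_to_trtllm(nemo_ckpt: str, engine_dir: str):
    """Export a .nemo checkpoint to a TensorRT-LLM engine (hedged sketch)."""
    # Import inside the function: nemo.export is only available inside the
    # NeMo Framework container with TensorRT-LLM installed.
    from nemo.export import TensorRTLLM

    exporter = TensorRTLLM(model_dir=engine_dir)   # where the engine is written
    exporter.export(
        nemo_checkpoint_path=nemo_ckpt,            # e.g. a downloaded .nemo file
        model_type="llama",                        # assumption: the Llama-style
                                                   # architecture family is used
    )
    return exporter

# Usage (inside the container; paths are hypothetical):
# exporter = export_mistral_to_trtllm("mistral-7b-instruct.nemo", "/tmp/engine")
# print(exporter.forward(["What is TensorRT-LLM?"]))
```

Once the engine is built, the same `model_dir` can be loaded for serving, so export and deployment share one artifact directory.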

Detailed information, including performance results, is available here.