Finetune Mistral 7B Using Quick Deploy

In this notebook, we will use NVIDIA's NeMo Framework to finetune the Mistral 7B LLM. You can run it using the Brev quick deploy option.
April 24, 2024
Finetuning is the process of adjusting the weights of a pre-trained foundation model with custom data. Because foundation models can be very large, a family of techniques known as parameter-efficient finetuning (PEFT) has gained traction recently. PEFT updates only a small number of parameters rather than the full set of model weights, and it encompasses several methods, including P-Tuning, LoRA, Adapters, and IA3. For those interested in a deeper understanding of these methods, we have included a list of additional resources below.
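To give a rough sense of why PEFT is attractive, the sketch below compares the number of trainable parameters for one weight matrix under full finetuning and under LoRA. LoRA keeps the original matrix frozen and learns a low-rank update made of two small matrices. The 4096 x 4096 shape and the rank of 8 are illustrative assumptions, not settings taken from this notebook, and the helper names are hypothetical.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in);
    the original weight matrix stays frozen.
    """
    return rank * (d_out + d_in)


# Illustrative values only (assumptions, not this notebook's configuration).
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full finetuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"LoRA trains {lora / full:.2%} of the full count for this matrix")
```

The saving grows with the size of the matrix and shrinks as the rank increases, which is why the rank is usually treated as a tunable hyperparameter.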

To streamline your experience and jump directly into a GPU-accelerated environment with this notebook and NeMo pre-installed, click the badge below. Our 1-click deploys are powered by

Getting started

Use the 1-click deploy link above to set up a machine with NeMo installed. Once the VM is ready, use the Access Notebook button to open the JupyterLab instance.


For this notebook, we use the 7B-parameter Mistral model and the NeMo framework. We will finetune on the PubMedQA dataset, training the model to respond with a simple yes/no answer. PubMedQA is a novel biomedical question answering (QA) dataset collected from PubMed abstracts.
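As a minimal sketch of what "training the model to respond with a simple yes/no answer" can look like at the data level, the snippet below turns one question answering record into a prompt/answer pair and serializes it as a JSON line. The prompt template, the `input`/`output` key names, the `to_example` helper, and the placeholder record are all assumptions made for illustration; the notebook's actual preprocessing may differ.

```python
import json


def to_example(question: str, contexts: list[str], decision: str) -> dict:
    """Build one supervised finetuning example from a QA record.

    The prompt layout and the "input"/"output" keys are hypothetical
    choices for this sketch, not a format required by any library.
    """
    context = " ".join(contexts)
    prompt = f"QUESTION: {question}\nCONTEXT: {context}\nANSWER (yes or no): "
    return {"input": prompt, "output": decision.strip().lower()}


# Placeholder record (made up for illustration, not real PubMedQA data).
record = {
    "question": "Does treatment X reduce symptom Y?",
    "contexts": ["Placeholder abstract sentence one.", "Placeholder sentence two."],
    "decision": "Yes",
}

example = to_example(record["question"], record["contexts"], record["decision"])

# One JSON object per line is a common layout for finetuning datasets.
line = json.dumps(example)
print(line)
```

Normalizing the label to lowercase keeps the target text consistent, so the model sees exactly one spelling of each answer during training.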


NVIDIA NeMo Framework is a generative AI framework built for researchers and PyTorch developers working on large language models (LLMs), multimodal models (MM), automatic speech recognition (ASR), and text-to-speech synthesis (TTS). NeMo lets you design, implement, and scale new AI models using existing pre-trained models and a simple configuration API.