In this notebook, we will use NVIDIA's NeMo framework to fine-tune the Mistral 7B LLM. Fine-tuning is the process of adjusting the weights of a pre-trained foundation model with custom data. Because foundation models can be very large, a family of methods known as parameter-efficient fine-tuning (PEFT) has gained traction recently. PEFT encompasses several techniques, including P-Tuning, LoRA, Adapters, and IA3. For those interested in a deeper understanding of these methods, we have included a list of additional resources below.
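To give a feel for the core idea behind one of these methods, here is a minimal, self-contained sketch of LoRA: the pre-trained weight matrix is frozen and only a small low-rank update is learned on top of it. The class name, dimensions, and hyperparameters below are illustrative placeholders, not NeMo's actual implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Toy LoRA layer: freeze the original weight W and learn a low-rank
    update B @ A, so the effective weight becomes W + (alpha / rank) * B @ A.
    Only A and B, i.e. rank * (d_in + d_out) parameters, are trainable."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)            # frozen pre-trained weight
        self.lora_a = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(d_out, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus the scaled low-rank path
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

layer = LoRALinear(d_in=4096, d_out=4096, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # tiny compared to the frozen 4096x4096 weight
```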
To streamline your experience and jump directly into a GPU-accelerated environment with this notebook and NeMo pre-installed, click the badge below. Our 1-click deploys are powered by Brev.dev.
Use the 1-click deploy link above to set up a machine with NeMo installed. Once the VM is ready, click the Access Notebook button to open the JupyterLab instance.
For this notebook, we use the 7-billion-parameter Mistral 7B model with the NeMo framework. We will fine-tune on the PubMedQA dataset, training our model to respond with a simple yes/no answer. PubMedQA is a biomedical question answering (QA) dataset collected from PubMed abstracts.
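As a rough sketch of the data preparation this implies, the snippet below converts a PubMedQA-style record into a prompt/completion JSONL line of the kind instruction-tuning pipelines commonly consume. The field names ("QUESTION", "CONTEXTS", "final_decision") follow the public PubMedQA release, and the output keys ("input"/"output") and file name are assumptions; the exact format used later in this notebook may differ.

```python
import json

def to_jsonl_record(sample: dict) -> dict:
    """Turn one PubMedQA-style sample into an input/output pair.
    Adjust the field names if your copy of the dataset differs."""
    context = " ".join(sample["CONTEXTS"])
    prompt = f"Context: {context}\nQuestion: {sample['QUESTION']}\nAnswer (yes/no):"
    return {"input": prompt, "output": sample["final_decision"]}

# Hypothetical example record, for illustration only.
sample = {
    "QUESTION": "Do preoperative statins reduce atrial fibrillation after CABG?",
    "CONTEXTS": ["Atrial fibrillation is a common complication after cardiac surgery."],
    "final_decision": "yes",
}

with open("pubmedqa_train.jsonl", "w") as f:
    f.write(json.dumps(to_jsonl_record(sample)) + "\n")
```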
The NVIDIA NeMo framework is a generative AI framework built for researchers and PyTorch developers working on large language models (LLMs), multimodal models (MM), automatic speech recognition (ASR), and text-to-speech synthesis (TTS). It provides a scalable way to design, implement, and scale new AI models, building on existing pre-trained models and a simple API for configuration.
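NeMo experiments are typically driven by YAML configurations managed with OmegaConf/Hydra. The snippet below is a hedged sketch of what working with such a config looks like; the file name, config keys, and checkpoint path are illustrative placeholders rather than the exact ones used later in this notebook.

```python
from omegaconf import OmegaConf

# Load a NeMo-style YAML config (file name is a placeholder).
cfg = OmegaConf.load("megatron_gpt_peft_tuning_config.yaml")

# Override a few fields programmatically, much like Hydra command-line overrides.
# The keys below are assumptions about the config layout, for illustration only.
cfg.model.restore_from_path = "mistral-7b.nemo"  # hypothetical path to the base checkpoint
cfg.model.peft.peft_scheme = "lora"              # hypothetical key selecting the PEFT method
cfg.trainer.max_steps = 1000

print(OmegaConf.to_yaml(cfg))
```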