Mistral-NeMo-Minitron-2B-128K-Instruct

Publisher: NVIDIA
Latest Version: 1.0.0
Modified: December 18, 2024
Size: 5.83 GB

Model Overview

Description:

Mistral-NeMo-Minitron-2B-128K-Instruct is a model for generating responses for various text-generation tasks including roleplaying, retrieval augmented generation, and function calling. It is a fine-tuned version of a 2B Base model that was pruned and distilled from nvidia/Mistral-NeMo-Minitron-8B-Base using our LLM compression technique. The model was trained using a multi-stage Supervised Fine-tuning (SFT) and preference-based alignment technique with NeMo Aligner. For details on the alignment technique, please refer to the Nemotron-4 340B Technical Report. The model supports a context length of 131,072 tokens. This model is ready for commercial use.

License/Terms of Use:

NVIDIA Open Model License

Model Architecture:

Architecture Type: Transformer
Network Architecture: Decoder-only

Input:

Input Type(s): Text (Prompt)
Input Format(s): String
Input Parameters: One Dimensional (1D)
Other Properties Related to Input: The model has a maximum of 131,072 input tokens.

Output:

Output Type(s): Text (Response)
Output Format: String
Output Parameters: One Dimensional (1D)
Other Properties Related to Output: Input and output together must fit within the 131,072-token context window; the maximum number of output tokens can be configured independently of the input length.

Prompt Format:

We recommend using the following prompt template, which was used to fine-tune the model. The model may not perform optimally without it.

<s>System
{system prompt}</s>
<s>User
{user prompt}</s>
<s>Assistant\n
  • Note that a newline character (\n) should be added after <s>Assistant as a generation prompt
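For concreteness, the following is a minimal sketch of applying this template with Hugging Face transformers. The checkpoint id, the example prompts, and the generation settings are illustrative assumptions, not part of this card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Mistral-NeMo-Minitron-2B-128k-Instruct"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build the prompt exactly as in the template above, including the
# trailing newline after <s>Assistant that serves as the generation prompt.
system_prompt = "You are a helpful assistant."
user_prompt = "Summarize the plot of Hamlet in two sentences."
prompt = (
    f"<s>System\n{system_prompt}</s>\n"
    f"<s>User\n{user_prompt}</s>\n"
    f"<s>Assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens bounds the response length independently of the input,
# subject to the 131,072-token context window.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))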

Evaluation Results

Category Benchmark # Shots Mistral-NeMo-Minitron-8B-128K-Instruct
General MMLU 5 58.9
MT Bench (GPT4-Turbo) 0 6.69
Math GMS8K 0 75.4
Reasoning GPQA (Main) 0 30.4
MUSR 0 35.7
Code HumanEval 0 45.7
MBPP 0 57.5
Instruction Following IFEval 0 74.0
Tool Use BFCL v2 Live 0 59.1

Software Integration (Cloud):

Runtime Engine: NeMo Framework 24.09

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Blackwell
  • NVIDIA Hopper
  • NVIDIA Lovelace

Supported Operating System(s):

  • Linux

Model Version(s)

Mistral-NeMo-Minitron-2B-128K-Instruct

Training & Evaluation:

Training Dataset:

Data Collection Method by dataset:

  • Hybrid: Automated, Human

Labeling Method by dataset:

  • Hybrid: Automated, Human

Evaluation Dataset:

Data Collection Method by dataset:

  • Hybrid: Automated, Human

Labeling Method by dataset:

  • Human

Inference:

Engine: TRT-LLM
Test Hardware:

  • A100
  • A10G
  • H100
  • L40S

Supported Hardware Platform(s): L40S, A10G, A100, H100
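
As an illustration of serving the model with the TRT-LLM engine named above, the sketch below uses the TensorRT-LLM high-level LLM API. The model path is an assumption, and the exact API surface may vary across TensorRT-LLM releases.

# Illustrative only: the checkpoint path is an assumption, and argument
# names may differ between TensorRT-LLM releases.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="nvidia/Mistral-NeMo-Minitron-2B-128k-Instruct")  # assumed path

# Use the recommended prompt template, including the trailing newline
# after <s>Assistant as the generation prompt.
prompt = (
    "<s>System\nYou are a helpful assistant.</s>\n"
    "<s>User\nWhat is retrieval augmented generation?</s>\n"
    "<s>Assistant\n"
)
params = SamplingParams(max_tokens=128, temperature=0.7)

for output in llm.generate([prompt], params):
    print(output.outputs[0].text)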

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.

Please report security vulnerabilities or NVIDIA AI Concerns here.