Mistral-NeMo-Minitron-2B-128K-Instruct

Publisher: NVIDIA
Latest Version: 1.0.0
Modified: December 18, 2024
Size: 5.83 GB

Model Overview

Description:

Mistral-NeMo-Minitron-2B-128K-Instruct is a model for generating responses for various text-generation tasks including roleplaying, retrieval augmented generation, and function calling. It is a fine-tuned version of a 2B Base model that was pruned and distilled from nvidia/Mistral-NeMo-Minitron-8B-Base using our LLM compression technique. The model was trained using a multi-stage Supervised Fine-tuning (SFT) and preference-based alignment technique with NeMo Aligner. For details on the alignment technique, please refer to the Nemotron-4 340B Technical Report. The model supports a context length of 131,072 tokens. This model is ready for commercial use.

License/Terms of Use:

NVIDIA Open Model License

Model Architecture:

Architecture Type: Transformer
Network Architecture: Decoder-only

Input:

Input Type(s): Text (Prompt)
Input Format(s): String
Input Parameters: One Dimensional (1D)
Other Properties Related to Input: The model has a maximum of 131,072 input tokens.

Output:

Output Type(s): Text (Response)
Output Format: String
Output Parameters: One Dimensional (1D)
Other Properties Related to Output: Input and output together must fit within the 131,072-token context window; the maximum number of output tokens can be configured independently of the input length.

Prompt Format:

We recommend using the following prompt template, which was used to fine-tune the model. The model may not perform optimally without it.

<s>System
{system prompt}</s>
<s>User
{user prompt}</s>
<s>Assistant\n
  • Note that a newline character (\n) should be added after <s>Assistant as a generation prompt
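For concreteness, the following is a minimal sketch of applying this template with Hugging Face transformers. The checkpoint id, the example prompts, and the generation settings are illustrative assumptions, not part of this card.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Mistral-NeMo-Minitron-2B-128k-Instruct"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build the prompt exactly as in the template above, including the
# trailing newline after <s>Assistant that serves as the generation prompt.
system_prompt = "You are a helpful assistant."
user_prompt = "Summarize the plot of Hamlet in two sentences."
prompt = (
    f"<s>System\n{system_prompt}</s>\n"
    f"<s>User\n{user_prompt}</s>\n"
    f"<s>Assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens bounds the response length independently of the input,
# subject to the 131,072-token context window.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))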

Evaluation Results

Category Benchmark # Shots Mistral-NeMo-Minitron-8B-128K-Instruct
General MMLU 5 58.9
MT Bench (GPT4-Turbo) 0 6.69
Math GMS8K 0 75.4
Reasoning GPQA (Main) 0 30.4
MUSR 0 35.7
Code HumanEval 0 45.7
MBPP 0 57.5
Instruction Following IFEval 0 74.0
Tool Use BFCL v2 Live 0 59.1

Software Integration (Cloud):

Runtime Engine: NeMo Framework 24.09

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere
  • NVIDIA Blackwell
  • NVIDIA Hopper
  • NVIDIA Lovelace

Supported Operating System(s):

  • Linux

Model Version(s)

Mistral-NeMo-Minitron-2B-128K-Instruct

Training & Evaluation:

Training Dataset:

Data Collection Method by dataset:

  • Hybrid: Automated, Human

Labeling Method by dataset:

  • Hybrid: Automated, Human

Evaluation Dataset:

Data Collection Method by dataset:

  • Hybrid: Automated, Human

Labeling Method by dataset:

  • Human

Inference:

Engine: TRT-LLM
Test Hardware:

  • A100
  • A10G
  • H100
  • L40S

Supported Hardware Platform(s): L40S, A10G, A100, H100
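
As an illustration of serving the model with the TRT-LLM engine named above, the sketch below uses the TensorRT-LLM high-level LLM API. The model path is an assumption, and the exact API surface may vary across TensorRT-LLM releases.

# Illustrative only: the checkpoint path is an assumption, and argument
# names may differ between TensorRT-LLM releases.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="nvidia/Mistral-NeMo-Minitron-2B-128k-Instruct")  # assumed path

# Use the recommended prompt template, including the trailing newline
# after <s>Assistant as the generation prompt.
prompt = (
    "<s>System\nYou are a helpful assistant.</s>\n"
    "<s>User\nWhat is retrieval augmented generation?</s>\n"
    "<s>Assistant\n"
)
params = SamplingParams(max_tokens=128, temperature=0.7)

for output in llm.generate([prompt], params):
    print(output.outputs[0].text)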

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.

Please report security vulnerabilities or NVIDIA AI Concerns here.