Nemotron-4-340B-Reward
Use the NGC CLI to download:
| Field: | Response: |
|---|---|
| Intended Application(s) & Domain(s): | Large Language Model Development |
| Model Type: | Generative Pre-Trained Transformer (GPT) |
| Intended Users: | This model is intended for developers and researchers building LLMs. |
| Output: | Scalar Values (List of 9 Floats) |
| Describe how the model works: | The network architecture of this model is Nemotron-4 Reward. The decoder in Nemotron-4 generates a list of 5 floating-point numbers, one for each of the 5 HelpSteer2 attributes (Helpfulness, Correctness, Coherence, Complexity, Verbosity), based on the end-of-response token from the final layer of the model. |
| Technical Limitations: | The model was trained on English preference data and has not been tested on non-English use cases. |
| Verified to have met prescribed quality standards? | Yes |
| Performance Metrics: | Accuracy, Throughput, and Latency |
| Potential Known Risks: | The model may produce scores that are biased or incorrect, depending on the provided prompt and the user’s preferences. |
| End User License Agreement: | Please see detailed model cards. |
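
The "Describe how the model works" row can be illustrated with a small sketch: take the final-layer hidden state at the end-of-response token and map it to one score per HelpSteer2 attribute. The attribute names come from the table. The linear projection, the function name `reward_head`, the toy dimensions, and the random weights are assumptions made for illustration and are not taken from the model's implementation. The Output row lists nine floats; this sketch covers only the five attributes named in the description.

```python
import random

# Attribute names come from the model card; everything else is illustrative.
ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]


def reward_head(hidden_states, end_index, weights, bias):
    """Map the hidden state at the end-of-response token to one score per attribute.

    hidden_states: per-token vectors from the final layer (hypothetical input).
    end_index: position of the end-of-response token.
    weights: one weight vector per attribute (assumed linear projection).
    bias: one float per attribute.
    """
    h = hidden_states[end_index]
    scores = [
        sum(w_i * h_i for w_i, h_i in zip(w, h)) + b
        for w, b in zip(weights, bias)
    ]
    return dict(zip(ATTRIBUTES, scores))


# Toy usage: random numbers stand in for real activations and trained weights.
rng = random.Random(0)
hidden_size, num_tokens = 8, 6
hidden_states = [
    [rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(num_tokens)
]
weights = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in ATTRIBUTES]
bias = [0.0] * len(ATTRIBUTES)

scores = reward_head(hidden_states, end_index=num_tokens - 1, weights=weights, bias=bias)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Returning a name-to-score mapping rather than a bare list keeps each float tied to its attribute, which helps when comparing the scores of two candidate responses to the same prompt.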