Nemotron-4-340B-Reward | NVIDIA NGC

Field:	Response:
Intended Application(s) & Domain(s):	Large Language Model Development
Model Type:	Generative Pre-Trained Transformer (GPT)
Intended Users:	This model is intended for developers and researchers building LLMs.
Output:	Scalar Values (List of 9 Floats)
Describe how the model works:	The network architecture of this model is Nemotron-4 Reward. The decoder in Nemotron-4 generates a list of five floating-point numbers, one for each of the five HelpSteer2 attributes (Helpfulness, Correctness, Coherence, Complexity, Verbosity), computed from the final-layer representation of the end-of-response token.
Technical Limitations:	The model was trained on English preference data and has not been tested on non-English use cases.
Verified to have met prescribed quality standards?	Yes
Performance Metrics:	Accuracy, Throughput, and Latency
Potential Known Risks:	The model may produce scores that are biased or incorrect, depending on the provided prompt and the user’s preferences.
End User License Agreement:	Please see detailed model cards.
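
The "Describe how the model works" row can be illustrated with a toy sketch. This is not NVIDIA's implementation: it assumes the reward head is a plain linear projection of the end-of-response token's final-layer hidden state, and the function name `reward_head`, the tiny hidden size, and all weight values are hypothetical. It covers only the five named HelpSteer2 attributes, not the full list of nine floats given in the Output row.

```python
# Toy sketch (assumption): a linear reward head that maps the final-layer
# hidden state of the end-of-response token to one score per attribute.
# All names and values here are hypothetical and for illustration only.

ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]


def reward_head(hidden_states, weights, bias):
    """Return one score per HelpSteer2 attribute.

    hidden_states: list of per-token final-layer vectors (list of lists of floats)
    weights:       one weight row per attribute, each as long as a hidden vector
    bias:          one bias term per attribute
    """
    if len(weights) != len(ATTRIBUTES) or len(bias) != len(ATTRIBUTES):
        raise ValueError("expected one weight row and one bias per attribute")

    # Only the last position (the end-of-response token) is used for scoring.
    last_hidden = hidden_states[-1]

    scores = []
    for row, b in zip(weights, bias):
        if len(row) != len(last_hidden):
            raise ValueError("weight row length must match hidden size")
        # Dot product of the attribute's weight row with the hidden state, plus bias.
        scores.append(sum(w * h for w, h in zip(row, last_hidden)) + b)

    return dict(zip(ATTRIBUTES, scores))


if __name__ == "__main__":
    # Two tokens with a hidden size of 2; only the second (last) token matters.
    toy_hidden = [[0.0, 0.0], [1.0, 2.0]]
    toy_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, -1.0]]
    toy_bias = [0.0, 0.0, 0.0, 0.0, 0.0]
    for name, score in reward_head(toy_hidden, toy_weights, toy_bias).items():
        print(f"{name}: {score}")
```

In practice, a caller would read the scores by attribute name, for example to rank candidate responses to the same prompt by helpfulness.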