Nemotron-4-340B-Reward

This model is backed by NVIDIA's Plus Plus (++) Promise. See the promise to learn more about the quality of the datasets used to train this model.
Field: Response
Intended Application(s) & Domain(s): Large Language Model Development
Model Type: Generative Pre-Trained Transformer (GPT)
Intended Users: This model is intended for developers and researchers building LLMs.
Output: Scalar Values (List of 9 Floats)
Describe how the model works: The network architecture of this model is Nemotron-4 Reward. The decoder in Nemotron-4 generates a list of 5 floating-point numbers, one for each of the 5 HelpSteer2 dataset attributes (Helpfulness, Correctness, Coherence, Complexity, Verbosity), based on the end-of-response token from the final layer of the model.
Technical Limitations: The model was trained on English preference data and has not been tested on non-English use cases.
Verified to have met prescribed quality standards? Yes
Performance Metrics: Accuracy, Throughput, and Latency
Potential Known Risks: The model may produce scores that are biased or incorrect based on the provided prompt and the user’s preferences.
End User License Agreement: Please see detailed model cards.
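
The "Describe how the model works" row says the five attribute scores are derived from the end-of-response token in the final layer. The following is a minimal sketch of that idea, not NVIDIA's implementation: it assumes a plain linear projection from that token's hidden state to five scores. The function name `reward_head`, the `weights` and `bias` parameters, and the toy values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence

# Attribute order as listed in the model card's description.
HELPSTEER2_ATTRIBUTES = [
    "helpfulness", "correctness", "coherence", "complexity", "verbosity",
]


def reward_head(
    final_layer_states: Sequence[Sequence[float]],
    end_of_response_index: int,
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Dict[str, float]:
    """Map the end-of-response token's hidden state to one score per attribute.

    ASSUMPTION: a linear projection (weights @ hidden + bias). The model card
    only states that the scores are based on the end-of-response token from
    the final layer; the exact head is not described there.
    """
    n = len(HELPSTEER2_ATTRIBUTES)
    if len(weights) != n or len(bias) != n:
        raise ValueError("expected one weight row and one bias per attribute")

    # Select the hidden state at the end-of-response position only.
    hidden = final_layer_states[end_of_response_index]

    scores = []
    for row, b in zip(weights, bias):
        if len(row) != len(hidden):
            raise ValueError("weight row length must match hidden size")
        scores.append(sum(w * h for w, h in zip(row, hidden)) + b)
    return dict(zip(HELPSTEER2_ATTRIBUTES, scores))


# Toy example: two token positions with a hidden size of 3 (hypothetical values).
states = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
]
bias = [0.0, 0.0, 0.0, 0.0, 0.5]
scores = reward_head(states, end_of_response_index=1, weights=weights, bias=bias)
print(scores)
```

Returning a dictionary keyed by attribute name, rather than a bare list, keeps each score tied to its meaning when the scores are passed on to downstream code.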