Nemotron-4-340B-Reward

This model is backed by NVIDIA's Plus Plus (++) Promise. See the Promise to learn more about the quality of the datasets used to train this model.
Field: Response
Intended Application(s) & Domain(s): Large Language Model Development
Model Type: Generative Pre-Trained Transformer (GPT)
Intended Users: This model is intended for developers and researchers building LLMs.
Output: Scalar Values (List of 9 Floats)
Describe how the model works: The network architecture of this model is Nemotron-4 Reward. The decoder in Nemotron-4 generates a list of 5 floating-point numbers, one for each of the 5 HelpSteer2 attributes (Helpfulness, Correctness, Coherence, Complexity, Verbosity), based on the end-of-response token from the final layer of the model.
Technical Limitations: The model was trained on English preference data and has not been tested on non-English use cases.
Verified to have met prescribed quality standards? Yes
Performance Metrics: Accuracy, Throughput, and Latency
Potential Known Risks: The model may produce scores that are biased or incorrect, depending on the provided prompt and the user’s preferences.
End User License Agreement: Please see detailed model cards.
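
The "Describe how the model works" row above says the model emits one floating-point score per HelpSteer2 attribute. A minimal sketch of how a caller might label such a list is shown here. It assumes the five scores arrive in the order the attributes are listed in this card, which the card does not state explicitly. The names `HELPSTEER2_ATTRIBUTES` and `label_scores` are hypothetical helpers, not part of any NVIDIA API, and the sketch does not say how the 5 attribute scores relate to the 9 floats listed under Output.

```python
from typing import Dict, Sequence

# Attribute names as listed in this card. ASSUMPTION: the score list
# follows this same order; the card does not confirm the ordering.
HELPSTEER2_ATTRIBUTES = (
    "helpfulness",
    "correctness",
    "coherence",
    "complexity",
    "verbosity",
)


def label_scores(scores: Sequence[float]) -> Dict[str, float]:
    """Pair each of the five attribute scores with its attribute name.

    Hypothetical helper for illustration only. It expects exactly five
    values and raises ValueError for any other length.
    """
    if len(scores) != len(HELPSTEER2_ATTRIBUTES):
        raise ValueError(
            f"expected {len(HELPSTEER2_ATTRIBUTES)} scores, got {len(scores)}"
        )
    return {name: float(s) for name, s in zip(HELPSTEER2_ATTRIBUTES, scores)}


if __name__ == "__main__":
    # Made-up values for illustration; not real model output.
    example = [3.5, 3.25, 4.0, 1.5, 1.0]
    for name, value in label_scores(example).items():
        print(f"{name}: {value}")
```

Labeling the scores by name rather than by index keeps downstream code readable, and the length check surfaces a mismatch early if a caller passes a raw output with a different number of values.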
