NeMo Evaluator

Model evaluation service for NeMo microservices

NeMo Evaluator Microservice Container

The NeMo Evaluator microservice evaluates large language models (LLMs) as part of the NeMo Microservices ecosystem. It supports systematic assessment of LLM capabilities through academic benchmarks, custom evaluations, and LLM-as-judge techniques.

You can use the Evaluator to test model performance across various dimensions, compare different models against consistent metrics, and conduct evaluations with your own custom datasets to ensure models meet your specific requirements before deployment.
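As a rough illustration of what comparing models against a consistent metric on a custom dataset means, the sketch below scores two stand-in models with exact-match accuracy. It does not use the NeMo Evaluator API: the dataset, the `model_a` and `model_b` functions, and the `exact_match_accuracy` and `compare` helpers are all hypothetical names chosen for this example, and plain Python functions stand in for real LLM endpoints.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical custom dataset: (prompt, expected answer) pairs.
DATASET: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
    ("Opposite of hot?", "cold"),
]


def exact_match_accuracy(
    model: Callable[[str], str], dataset: List[Tuple[str, str]]
) -> float:
    """Fraction of prompts whose normalized answer equals the expected one."""
    hits = sum(
        model(prompt).strip().lower() == expected.strip().lower()
        for prompt, expected in dataset
    )
    return hits / len(dataset)


# Stand-in "models": lookup tables in place of real LLM calls (assumption).
def model_a(prompt: str) -> str:
    answers = {
        "What is 2 + 2?": "4",
        "Capital of France?": "paris",
        "Opposite of hot?": "warm",  # deliberately wrong
    }
    return answers.get(prompt, "")


def model_b(prompt: str) -> str:
    answers = {
        "What is 2 + 2?": "4",
        "Capital of France?": "Paris",
        "Opposite of hot?": "Cold",
    }
    return answers.get(prompt, "")


def compare(
    models: Dict[str, Callable[[str], str]], dataset: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Score every model on the same dataset with the same metric."""
    return {name: exact_match_accuracy(fn, dataset) for name, fn in models.items()}


scores = compare({"model_a": model_a, "model_b": model_b}, DATASET)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Holding the dataset and the metric fixed across models is what makes the resulting scores comparable. A real evaluation would replace the lookup tables with calls to deployed model endpoints and would likely use a metric better suited to free-form text than exact match.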

Resources

Helm Chart | User Guide

Note: Use, distribution or deployment of this microservice in production requires an NVIDIA AI Enterprise License.

Governing Terms

The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.

Publisher: NVIDIA
Latest Tag: 25.12
Updated: December 16, 2025 UTC
Compressed Size: 271.72 MB
Multinode Support: No
Multi-Arch Support: Yes
