Container
NVIDIA NeMo Evaluator-compatible container with Tau2-Bench support
26.03
Signed
This image has a digital signature verifying that it has not been altered or corrupted since its signing.
Scanned
No malware was found in this artifact.
NVIDIA NeMo Evaluator
NVIDIA NeMo Evaluator aims to advance and refine state-of-the-art methodologies for model evaluation and to deliver them as modular evaluation packages (evaluation containers and pip wheels) that teams can use as standardized building blocks.
Overview
Tau2-Bench implements a simulation framework for evaluating customer service agents across various domains.
Key Features:
- Comprehensive evaluation across multiple task types
- Standardized benchmarking methodology
- Support for diverse model architectures
Quick Start Guide
List the available evaluations:
NVIDIA NeMo Evaluator provides evaluation clients built specifically to evaluate model endpoints using our Standard API.
Launching an Evaluation
Run the Evaluation of Your Choice
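The commands for these steps are not shown on this page. As a minimal sketch, the Python snippet below assembles the two command lines and prints them without running anything. It assumes the container exposes the `nemo-evaluator` CLI with `ls` and `run_eval` subcommands and the flags shown; those names, the helper `build_run_eval_command`, and every value passed to it are illustrative assumptions, so confirm them with `nemo-evaluator run_eval --help` inside the container before use.

```python
import shlex

# Assumed CLI entry point and subcommand for listing evaluations.
LIST_COMMAND = ["nemo-evaluator", "ls"]


def build_run_eval_command(eval_type, model_id, model_url, output_dir,
                           model_type="chat", api_key_name=None):
    """Assemble an argument list for an assumed `nemo-evaluator run_eval` call.

    The flag names are assumptions, not confirmed by this page.
    """
    cmd = [
        "nemo-evaluator", "run_eval",
        "--eval_type", eval_type,      # a task name taken from the listing
        "--model_id", model_id,        # model name as the endpoint knows it
        "--model_url", model_url,      # endpoint URL of the model under test
        "--model_type", model_type,    # endpoint style, e.g. chat
        "--output_dir", output_dir,    # where results should be written
    ]
    if api_key_name:
        # Name of the environment variable holding the key, not the key itself.
        cmd += ["--api_key_name", api_key_name]
    return cmd


# Placeholder values only; replace them with a real task name and endpoint.
example = build_run_eval_command(
    eval_type="<eval_type>",
    model_id="<model_id>",
    model_url="http://localhost:8000/v1/chat/completions",
    output_dir="/workspace/results",
    api_key_name="MY_API_KEY",
)

print(shlex.join(LIST_COMMAND))
print(shlex.join(example))
```

Building the command as a list rather than a single string keeps each argument intact when it is later passed to `subprocess.run`, and `shlex.join` is used only to display a copy-pasteable form.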
3rd Party Source Code
Users can download the third-party source code from the URL provided in the container's README, located in workdir.
Publisher: NVIDIA
Latest Tag: 26.03
Updated: March 11, 2026 UTC
Compressed Size: 605.3 MB
Multinode Support: No
Multi-Arch Support: Yes
System