NGC Catalog
Dynamo TensorRT-LLM gpt-oss
Description
This container image delivers a ready-to-deploy runtime for Dynamo’s distributed inference framework, purpose-built for OpenAI-compatible models (gpt-oss).
Publisher
NVIDIA
Latest Tag
latest
Modified
August 5, 2025
Compressed Size
28.6 GB
Multinode Support
Yes
Multi-Arch Support
Yes
latest (Latest) Security Scan Results

Linux / arm64

Linux / amd64

Overview

The Dynamo gpt-oss TensorRT-LLM runtime container is a specialized, Docker-based environment designed to run NVIDIA Dynamo with the TensorRT-LLM backend, optimized for OpenAI's open-weight gpt-oss models. This container streamlines high-performance, distributed large language model (LLM) inference by packaging all required dependencies, runtime components, and optimizations. It provides a consistent, production-ready environment for deploying and serving gpt-oss models with maximum efficiency on NVIDIA GPUs.

Key Components

  • TensorRT-LLM Backend: Leverages NVIDIA’s open-source TensorRT-LLM library for state-of-the-art LLM inference optimizations.

  • Dynamo Core Services: Provides HTTP API server, request routing, and distributed worker processes for scalable prefill and decode operations.

  • Distributed Coordination: Integrates with etcd and NATS for robust service discovery and messaging.

  • OpenAI-Compatible API: Exposes endpoints matching OpenAI’s API, enabling seamless integration with existing OpenAI-compatible clients and tools.

  • gpt-oss Model Support: Ensures compatibility with and optimized performance for OpenAI's open-weight gpt-oss models.
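Because the API is OpenAI-compatible, any standard HTTP client can exercise it once the service is running. The sketch below builds a chat-completions request body; the model name, port (8000), and endpoint host are assumptions for illustration, not values documented on this page.

```shell
# Illustrative request body for the OpenAI-compatible chat completions
# endpoint. The model name is an assumption; use the name your
# deployment actually registered.
cat > request.json <<'EOF'
{
  "model": "gpt-oss-120b",
  "messages": [{"role": "user", "content": "Say hello."}],
  "max_tokens": 64,
  "stream": false
}
EOF

# Send it to the Dynamo frontend (localhost:8000 is assumed):
#   curl -s http://localhost:8000/v1/chat/completions \
#     -H "Content-Type: application/json" \
#     -d @request.json
cat request.json
```

Existing OpenAI SDK clients can be pointed at the same endpoint by overriding their base URL, with no code changes beyond configuration.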

For more information about Dynamo features, please refer to the GitHub repository.

Getting Started

  • Select the Tags tab and locate the container image release that you want to run.

  • In the Pull Tag column, click the icon to copy the docker pull command.

  • Open a command prompt and paste the pull command. The container image pull begins. Ensure the pull completes successfully before proceeding to the next step.

  • Start required services (etcd and NATS) using Docker Compose:

    docker compose -f deploy/docker-compose.yml up -d

  • Run the container image and verify the Dynamo installation via:

    dynamo --version
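As a concrete sketch of the last two steps, the snippet below composes one way to launch the runtime. The image path is a hypothetical placeholder, not the real pull path; substitute the exact pull tag copied from the Tags tab.

```shell
# Hypothetical image path -- substitute the pull tag copied from the Tags tab.
IMAGE="nvcr.io/<org>/<team>/dynamo-tensorrtllm-gptoss:latest"

# Compose the run command: all GPUs exposed, host networking so the
# container can reach the etcd and NATS services started via docker compose.
RUN_CMD="docker run --rm -it --gpus all --network host ${IMAGE} dynamo --version"
echo "${RUN_CMD}"

# Execute the printed command on a host with NVIDIA GPUs and the
# NVIDIA Container Toolkit installed.
```

Host networking is used here only as a simple way for the container to reach etcd and NATS on the same machine; a dedicated Docker network with explicit endpoint configuration works as well.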

For more examples, please refer to the examples directory in the repository.

Support Matrix

Please refer to the support matrix to learn more about current hardware and architecture support.

License

NVIDIA Dynamo is released under the Apache-2.0 open-source license, making it freely available for development, research, and deployment.

Technical Support

GitHub Issues: Dynamo GitHub Issues