Dynamo TensorRT-LLM Runtime

Description

The Dynamo TensorRT-LLM runtime image is a containerized build of Dynamo + TensorRT-LLM that serves as the base runtime environment for TensorRT-LLM-based inference with Dynamo's distributed inference framework.

Publisher: NVIDIA
Latest Tag: 0.3.2
Modified: July 19, 2025
Compressed Size: 21.61 GB
Multinode Support: No
Multi-Arch Support: No
Security Scan Results (0.3.2, Linux / amd64): available on the catalog page.

Overview

The Dynamo TensorRT-LLM (tensorrtllm) runtime container is a pre-built, Docker-based environment designed to run NVIDIA Dynamo with the TensorRT-LLM backend for high-performance, distributed large language model (LLM) inference. It packages all necessary dependencies, runtime components, and optimizations to streamline deployment and ensure consistency across development and production environments.

Key Components

  • TensorRT-LLM Backend: TensorRT-LLM is an open-source library for optimizing Large Language Model (LLM) inference. It provides state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.

  • Dynamo Core Services: Includes the HTTP API server, request router, and worker processes for prefill and decode phases.

  • Supporting Services: Integrates with etcd and NATS for distributed coordination and messaging.

  • OpenAI-Compatible Frontend: Exposes an HTTP API compatible with OpenAI’s endpoints for easy integration (see the example request after this list).
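
As a rough sketch of how this frontend is typically exercised once a model is being served, the request below follows the standard OpenAI chat-completions format; the port (8000) is assumed to be the frontend's default and the model name is a placeholder, so adjust both to match your deployment:

    # Port 8000 and the model name below are assumptions; adjust to your deployment
    curl http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "your-served-model",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64
      }'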

For more information about Dynamo features, please refer to the GitHub repository.

Getting Started

  • Select the Tags tab and locate the container image release that you want to run.

  • In the Pull Tag column, click the icon to copy the docker pull command.

  • Open a command prompt and paste the pull command to begin downloading the container image. Ensure the pull completes successfully before proceeding to the next step.

  • Start required services (etcd and NATS) using Docker Compose (a quick status check is shown after these steps):

    docker compose -f deploy/docker-compose.yml up -d

  • Run the container image and verify the Dynamo CLI (a launch sketch follows these steps):

    dynamo --version
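
After bringing up etcd and NATS with Docker Compose as described above, standard Docker Compose commands can confirm that both services are healthy before the runtime container is launched; nothing here is Dynamo-specific:

    # List the compose-managed services and their current status
    docker compose -f deploy/docker-compose.yml ps
    # Inspect the logs if either service fails to start
    docker compose -f deploy/docker-compose.yml logs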

For more examples, please refer to the examples directory in the repository.
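
As a concrete sketch of the final step, the runtime container can be launched along the following lines. The image path is a placeholder (copy the exact path and tag from the Pull Tag column), --gpus all exposes the host GPUs, --network host lets the container reach the etcd and NATS services started above, and the trailing command assumes the image forwards commands passed to docker run:

    # Substitute the image path copied from the Pull Tag column
    docker run --rm -it --gpus all --network host \
      <image-path-from-pull-tag> \
      dynamo --version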

Support Matrix

Please refer to the support matrix to learn more about current hardware and architecture support. Dynamo currently provides pre-built containers for x86_64 only.

License

NVIDIA Dynamo is released under the open-source Apache-2.0 license, making it freely available for development, research, and deployment.

Technical Support

GitHub Issues: Dynamo GitHub Issues