NVIDIA
Dynamo TensorRT-LLM Runtime
Container

The Dynamo TensorRT-LLM runtime image is a containerized build of Dynamo and TensorRT-LLM. It serves as the base runtime environment for TensorRT-LLM-based inference with Dynamo's distributed inference framework.

NVIDIA Dynamo
NVIDIA Dynamo is a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models in multi-node distributed environments.