NVIDIA
TensorRT LLM Release
Container

Description

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

TensorRT LLM images carrying rc tags, such as 1.3.0rc2, are classified as Pre-Release candidates.
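As a rough illustration of the tag convention above, the following sketch classifies a tag string as a pre-release candidate. The pattern is an assumption inferred only from the example tag 1.3.0rc2 (three numeric components followed by "rc" and a number); the helper name is_pre_release is hypothetical and is not part of TensorRT LLM.

```python
import re

# Assumption: pre-release tags look like MAJOR.MINOR.PATCHrcN, as in "1.3.0rc2".
# Other tag shapes may exist; this pattern only covers the form shown above.
_RC_TAG = re.compile(r"^\d+\.\d+\.\d+rc\d+$")


def is_pre_release(tag: str) -> bool:
    """Return True if the tag matches the assumed rc pattern."""
    return _RC_TAG.match(tag) is not None


for tag in ("1.3.0rc2", "1.3.0rc19", "1.2.0"):
    label = "pre-release candidate" if is_pre_release(tag) else "not an rc tag"
    print(f"{tag}: {label}")
```

A check like this could be used to keep rc images out of a production deployment list while still allowing them in test environments.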

Overview

TensorRT LLM Release Container

The TensorRT LLM Release container provides a pre-built environment for running TensorRT LLM.

Visit the official GitHub repository for more details.

Running TensorRT LLM Using Docker

A typical command to launch the container is:

docker run --rm -it --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --gpus=all \
    nvcr.io/nvidia/tensorrt-llm/release:x.y.z

Here, x.y.z is the version of the TensorRT LLM container to use (see the release history on GitHub and the tags in the NGC Catalog). As a sanity check, run the following command inside the container:

python3 -c "import tensorrt_llm"

This command will print the TensorRT LLM version if everything is working correctly. After verification, you can explore and try the example scripts included in /app/tensorrt_llm/examples.
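For scripted checks, a lighter-weight variant of the sanity check can read the installed package version from metadata without importing the library. This is only a sketch: the helper name installed_version is hypothetical, and it assumes the Python distribution is registered under the same name as the import, tensorrt_llm. Because it does not import the package, it does not confirm that the GPU libraries load; the import command above remains the real test.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def installed_version(package: str = "tensorrt_llm") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    # find_spec returns None when the top-level module cannot be located.
    if importlib.util.find_spec(package) is None:
        return None
    try:
        # Assumption: the distribution name matches the import name.
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("tensorrt_llm was not found in this environment")
else:
    print(f"tensorrt_llm version: {version}")
```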

Alternatively, if you have already cloned the TensorRT LLM repository, you can use the following convenient command to run the container:

make -C docker ngc-release_run LOCAL_USER=1 DOCKER_PULL=1 IMAGE_TAG=x.y.z

This command pulls the specified container from the NVIDIA NGC registry, sets up the local user's account within the container, and launches it with full GPU support.
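When the container is launched from automation rather than by hand, the docker run invocation shown earlier can be assembled as an argument list. The sketch below only builds and prints the command; the function name docker_run_command is hypothetical, and the flags and image path are copied from the command above. The version is left as the x.y.z placeholder and must be replaced with a real tag.

```python
import shlex
from typing import List

# Image path taken from the docker run command shown above.
IMAGE = "nvcr.io/nvidia/tensorrt-llm/release"


def docker_run_command(version: str, image: str = IMAGE) -> List[str]:
    """Build the docker run argument list for a given container version."""
    return [
        "docker", "run", "--rm", "-it",
        "--ipc=host",
        "--ulimit", "memlock=-1",
        "--ulimit", "stack=67108864",
        "--gpus=all",
        f"{image}:{version}",
    ]


# "x.y.z" is a placeholder; substitute a tag from the NGC Catalog.
cmd = docker_run_command("x.y.z")
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems if it is later passed to subprocess.run.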

For comprehensive information about TensorRT LLM, including documentation, source code, examples, and installation guidelines, refer to the official GitHub repository and documentation.

Security CVEs

To review known CVEs on this image, refer to the Security Scanning tab on this page.

License

By pulling and using the container, you accept the terms and conditions of this End User License Agreement and Product-Specific Terms.

Publisher: NVIDIA
Latest Tag: 1.3.0rc19
Updated: June 23, 2026 UTC
Compressed Size: 20.57 GB
Multinode Support: No
Multi-Arch Support: Yes
