NVIDIA
TensorRT LLM Release
Container

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.

Layer  Label  Created
a3116e797afb946efe338815049fe66f960c880ae8474dc6b7fb4cedca40595c CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /app/tensorrt_llm; ExposedPorts 6006/tcp, 8888/tcp
06/08/2026 5:18 PM UTC
c0c32320cdd3ccb44477e273bdc8c82485d5ab9f0d2c24fec086355b2c03d209 ENV
TRT_LLM_GIT_COMMIT=15d06c0923b63ac1781784d5f59e1747bb47d5f1 TRT_LLM_VERSION=1.3.0rc18
06/08/2026 5:18 PM UTC
bf1c63c605213b1c5e1ecc3e5df0484f9797b8cf7efb9c512282b51a300e0587 ARG
TARGETARCH=amd64
06/08/2026 5:18 PM UTC
ad021102f623236566d31cc26728c0f0638cad6466eff28c712cbdecc341757a ARG
TRT_LLM_VER=1.3.0rc18
06/08/2026 5:18 PM UTC
45e42307c580e4c5fa4a76ddf518531c88fa6895c3f57be19aef9c095a789565 ARG
GIT_COMMIT=15d06c0923b63ac1781784d5f59e1747bb47d5f1
06/08/2026 5:18 PM UTC
6ba0ee6383311a500b6f251acd06d39309e68a04c9338d0075f2ef95d0eee14a RUN
/bin/bash -c cp /mnt/ctx/README.md ./ &&
  cp -r /mnt/ctx/docs ./docs &&
  cp -r /mnt/ctx/include ./include &&
  cp -r /mnt/ctx/examples ./examples &&
  chmod -R a+w examples &&
  cp /mnt/wheel/tensorrt_llm*.whl ./ &&
  cp -r /mnt/benchmarks ./benchmarks &&
  mkdir -p benchmarks/cpp &&
  cp /mnt/cpp_benchmarks/bertBenchmark /mnt/cpp_benchmarks/gptManagerBenchmark /mnt/cpp_benchmarks/disaggServerBenchmark benchmarks/cpp/ &&
  rm -v benchmarks/cpp/bertBenchmark.cpp benchmarks/cpp/gptManagerBenchmark.cpp benchmarks/cpp/disaggServerBenchmark.cpp benchmarks/cpp/CMakeLists.txt &&
  ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/bin")') bin &&
  test -f bin/executorWorker &&
  ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/libs")') lib &&
  test -f lib/libnvinfer_plugin_tensorrt_llm.so &&
  echo "/app/tensorrt_llm/lib" > /etc/ld.so.conf.d/tensorrt_llm.conf &&
  ldconfig &&
  ! ( ldd -v bin/executorWorker | grep tensorrt_llm | grep -q "not found" ) &&
  rm -rf /root/.cache/uv/archive-v0 &&
  rm -rf /usr/local/lib/python3.12/dist-packages/setuptools/_vendor/jaraco.context-5.3.0.dist-info &&
  rm -rf /usr/local/lib/python3.12/dist-packages/setuptools/_vendor/wheel-0.45.1.dist-info
06/08/2026 5:18 PM UTC
2034553d12f71b7d0465add0abe3eac89c2ec4a9805931f6f49d7c558ea1a54a RUN
/bin/bash -c pip install /tmp/wheel/tensorrt_llm*.whl
06/08/2026 5:18 PM UTC
1f6a84c8ee94c6bdcac884df2646d8975c7c59242343e03732ef1fca13d03a1e WORKDIR
/app/tensorrt_llm
06/08/2026 5:07 PM UTC
b8dc7fa760a4153cc8496eb306c772bb4366add7aa078f8b3d5a68711ad55421 RUN
SH_ENV=/etc/shinit_v2 BASH_ENV=/etc/bash.bashrc GITHUB_MIRROR=https://urm.nvidia.com/artifactory/github-go-remote PYTHON_VERSION=3.12.3 TRT_VER= CUDA_VER= CUDNN_VER= NCCL_VER= CUBLAS_VER= TORCH_INSTALL_TYPE=skip TRT_LLM_VER=1.3.0rc18 TARGETARCH=amd64 /bin/bash -c bash /tmp/gen_attribution.sh "devel" "${TRT_LLM_VER}" "${TARGETARCH}"
06/08/2026 5:03 PM UTC
c485f0a8f1d6b3474cd69fbfc31e26facbcaa216c30204d2724eb61117b2c426 ARG
TARGETARCH=amd64
06/08/2026 5:03 PM UTC
...
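
The largest RUN layer listed above symlinks the wheel's bin and libs directories out of site-packages, then fails the build if ldd reports any tensorrt_llm library as "not found" for bin/executorWorker. The following is a minimal Python sketch of that same check, not the build's actual tooling. The helper names (packaged_dir, unresolved_trtllm_libs) and the sample ldd output, including the libtensorrt_llm.so entry, are assumptions made for illustration.

```python
import site
from pathlib import Path


def packaged_dir(subdir: str) -> Path:
    """Return <site-packages>/tensorrt_llm/<subdir>, the path the RUN step symlinks."""
    return Path(site.getsitepackages()[0]) / "tensorrt_llm" / subdir


def unresolved_trtllm_libs(ldd_output: str) -> list[str]:
    """Return ldd lines that mention tensorrt_llm and are marked 'not found'.

    Mirrors `ldd -v ... | grep tensorrt_llm | grep "not found"` from the layer.
    """
    return [
        line.strip()
        for line in ldd_output.splitlines()
        if "tensorrt_llm" in line and "not found" in line
    ]


# Hypothetical ldd output, for illustration only (not captured from the image).
sample = """\
    libtensorrt_llm.so => /app/tensorrt_llm/lib/libtensorrt_llm.so (0x00007f0000000000)
    libnvinfer_plugin_tensorrt_llm.so => not found
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000100000)
"""

missing = unresolved_trtllm_libs(sample)
print(packaged_dir("bin"))
print(missing)
```

Inside the container, the string passed to unresolved_trtllm_libs would come from running ldd on bin/executorWorker. An empty result corresponds to the layer's negated grep succeeding.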
