NVIDIA
TensorRT LLM Release
Container

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
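As a minimal sketch of what that Python API looks like in use, the snippet below wraps the `LLM` and `SamplingParams` classes exported by the `tensorrt_llm` package. The helper name `generate_texts`, the sampling values, and the model named in the trailing comment are illustrative assumptions, not something this page specifies. Running the function requires the container (or an installed wheel) and an NVIDIA GPU.

```python
from typing import List


def generate_texts(model: str, prompts: List[str]) -> List[str]:
    """Generate one completion per prompt with the TensorRT LLM Python API."""
    # Imported lazily so this module can be loaded on machines without
    # tensorrt_llm installed; the call itself needs an NVIDIA GPU.
    from tensorrt_llm import LLM, SamplingParams

    # Sampling values are arbitrary illustrative choices.
    sampling = SamplingParams(temperature=0.8, top_p=0.95)
    llm = LLM(model=model)  # Hugging Face model ID or local checkpoint path
    outputs = llm.generate(prompts, sampling)
    return [out.outputs[0].text for out in outputs]


# Example call (hypothetical model choice):
# generate_texts("TinyLlama/TinyLlama-1.1B-Chat-v1.0", ["Hello, my name is"])
```

The import sits inside the function so that the module can be inspected or unit-tested on a machine without a GPU; only the actual call needs the full runtime.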

Layer | Label | Created
2a2cee645f0764ea3c98d0745a79bbb4b08317bac36206935b12ac14a9e7783c CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /app/tensorrt_llm; ExposedPorts 6006/tcp, 8888/tcp
06/22/2026 4:08 AM UTC
d2958971f9fca09f21b21374c8dbd9ffa867004d8f860ffc2f56330383d9e4e3 ENV
TRT_LLM_GIT_COMMIT=a8c595521e306b8fa60ddeaa533152f8052e1ac1 TRT_LLM_VERSION=1.3.0rc19
06/22/2026 4:08 AM UTC
3a4bfce660831b7f5dfa3ecc9249176ae256af1eb234eeea52f25125ef78c6fa ARG
TARGETARCH=amd64
06/22/2026 4:08 AM UTC
9f35a7dbaed46f24441bb150389a43f76d6029afee938b435baa42a7f72ceafe ARG
TRT_LLM_VER=1.3.0rc19
06/22/2026 4:08 AM UTC
b64f791e2717828126a8a5c82e1d66f697b07d17bd402c1503ded5513cfb105b ARG
GIT_COMMIT=a8c595521e306b8fa60ddeaa533152f8052e1ac1
06/22/2026 4:08 AM UTC
071ed54281d6dae42c5cafd9c330806e3d19ab615287e3d44d499acbfdae12f4 RUN
/bin/bash -c cp /mnt/ctx/README.md ./ &&
  cp -r /mnt/ctx/docs ./docs &&
  cp -r /mnt/ctx/include ./include &&
  cp -r /mnt/ctx/examples ./examples &&
  chmod -R a+w examples &&
  cp /mnt/wheel/tensorrt_llm*.whl ./ &&
  cp -r /mnt/benchmarks ./benchmarks &&
  mkdir -p benchmarks/cpp &&
  cp /mnt/cpp_benchmarks/bertBenchmark /mnt/cpp_benchmarks/gptManagerBenchmark /mnt/cpp_benchmarks/disaggServerBenchmark benchmarks/cpp/ &&
  rm -v benchmarks/cpp/bertBenchmark.cpp benchmarks/cpp/gptManagerBenchmark.cpp benchmarks/cpp/disaggServerBenchmark.cpp benchmarks/cpp/CMakeLists.txt &&
  ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/bin")') bin &&
  test -f bin/executorWorker &&
  ln -sv $(python3 -c 'import site; print(f"{site.getsitepackages()[0]}/tensorrt_llm/libs")') lib &&
  test -f lib/libnvinfer_plugin_tensorrt_llm.so &&
  echo "/app/tensorrt_llm/lib" > /etc/ld.so.conf.d/tensorrt_llm.conf &&
  ldconfig &&
  ! ( ldd -v bin/executorWorker | grep tensorrt_llm | grep -q "not found" ) &&
  rm -rf /root/.cache/uv/archive-v0 &&
  rm -rf /usr/local/lib/python3.12/dist-packages/setuptools/_vendor/jaraco.context-5.3.0.dist-info &&
  rm -rf /usr/local/lib/python3.12/dist-packages/setuptools/_vendor/wheel-0.45.1.dist-info
06/22/2026 4:08 AM UTC
004edb0ca5501c919a6a96229d9d795b64f9560907b1f4df066021d39f341c27 RUN
/bin/bash -c pip install /tmp/wheel/tensorrt_llm*.whl
06/22/2026 4:08 AM UTC
81e84be89db7a07870d403a8326d0b6c3b32cc31231072a36bf4e356cacccfe5 WORKDIR
/app/tensorrt_llm
06/22/2026 4:01 AM UTC
196c0797b5c5dd43b1ab9f354453adf9b82a65bd74a06c061333b3cc0d0ec874 RUN
SH_ENV=/etc/shinit_v2 BASH_ENV=/etc/bash.bashrc GITHUB_MIRROR=https://urm.nvidia.com/artifactory/github-go-remote PYTHON_VERSION=3.12.3 TRT_VER= CUDA_VER= CUDNN_VER= NCCL_VER= CUBLAS_VER= TORCH_INSTALL_TYPE=skip TRT_LLM_VER=1.3.0rc19 TARGETARCH=amd64 /bin/bash -c bash /tmp/gen_attribution.sh "devel" "${TRT_LLM_VER}" "${TARGETARCH}"
06/22/2026 3:57 AM UTC
fb7d89e8fbca4841ac2dd08cd54abfa8f125111f11579b8e69edf4679c47611c ARG
TARGETARCH=amd64
06/22/2026 3:57 AM UTC
...
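
The CONFIG layer above lists two exposed ports (6006/tcp and 8888/tcp). As a rough sketch, the following turns that port list into a `docker run` command line that publishes each port on the same host port and requests all GPUs. The helper name `docker_run_args` and the image reference are placeholders introduced here for illustration; substitute the image tag you actually pulled.

```python
import shlex
from typing import Iterable, List


def docker_run_args(image: str, exposed_ports: Iterable[str]) -> List[str]:
    """Build a `docker run` argv that publishes each exposed port 1:1."""
    args = ["docker", "run", "--rm", "-it", "--gpus", "all"]
    for spec in exposed_ports:
        # Each spec looks like "6006/tcp"; default to tcp if no protocol given.
        port, _, proto = spec.strip().partition("/")
        args += ["-p", f"{port}:{port}/{proto or 'tcp'}"]
    args.append(image)
    return args


# "<registry>/<image>:<tag>" is a placeholder, not a real image reference.
cmd = docker_run_args("<registry>/<image>:<tag>", "6006/tcp, 8888/tcp".split(","))
print(shlex.join(cmd))
```

Building the command as a list and quoting it only for display keeps it safe to hand directly to `subprocess.run` without a shell.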

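The large RUN layer in the listing locates the `tensorrt_llm` package through `site.getsitepackages()[0]` and then tests that `bin/executorWorker` and `libs/libnvinfer_plugin_tensorrt_llm.so` exist. A small sketch of the same check in plain Python follows; the function name `check_trtllm_files` and its optional `site_dir` override are assumptions added here, and the check covers only file presence, not the `ldd` linkage test.

```python
import site
from pathlib import Path
from typing import Dict, Optional


def check_trtllm_files(site_dir: Optional[str] = None) -> Dict[str, bool]:
    """Report whether the two files tested in the build step are present."""
    # Same lookup the build step uses: the first site-packages directory.
    root = Path(site_dir or site.getsitepackages()[0]) / "tensorrt_llm"
    expected = [
        root / "bin" / "executorWorker",
        root / "libs" / "libnvinfer_plugin_tensorrt_llm.so",
    ]
    return {str(path): path.is_file() for path in expected}


if __name__ == "__main__":
    for path, present in check_trtllm_files().items():
        print(f"{'ok' if present else 'MISSING':8} {path}")
```

Outside the container both entries will normally be reported as missing, which is the expected result when the wheel is not installed.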