NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, in the data center, or on embedded devices.

Layer: edcb0bb4528e4b00e9e204c946fe6d2787d92d74330f5493d59415b77f51cc96
Label: CONFIG Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
Created: 08/20/2025 4:01 AM UTC

Layer: 9b1f5a89c8456712bb529bfe43ae01311266e10771c9447ab8724fab838717b7
Label: RUN TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 rm -fr /opt/hpcx/ompi
Created: 08/20/2025 4:01 AM UTC

Layer: c0daba190eda617936957ee7648faed1c189e4df5a1ad45e3a1e4a437df4c4b8
Label: ENV LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
Created: 08/20/2025 4:01 AM UTC

Layer: 2333c855e0393aaa3c9e18fac971270165943a95d15d9bac258197c5765fb29a
Label: RUN TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
Created: 08/20/2025 4:01 AM UTC

Layer: e4a891c9746ebba2001321f370b447001fa06c9994ac203cd72dd188a90644a9
Label: COPY --chown=1000:1000 docker/sagemaker/serve /usr/bin/.
Created: 08/20/2025 4:01 AM UTC

Layer: b0ad18be91efa6a07e45c89c6767682f98bb9bcef79bb1fca8fc8e473dde370f
Label: LABEL com.amazonaws.sagemaker.capabilities.multi-models=true
Created: 08/20/2025 4:01 AM UTC

Layer: 0d8755b3fe86383002c64930efc049f1034061c71a73425dacb75dc1b20ffc28
Label: LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
Created: 08/20/2025 4:01 AM UTC

Layer: b3fd2a4f07259a12a92a27a61b850e6a6e8646c5094a4fe99ed17ace07829a05
Label: RUN TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 pip3 install -r python/openai/requirements.txt
Created: 08/20/2025 4:01 AM UTC

Layer: b6cbc1d3f444bc4ee4a635c7c5615dc54ec9687b9a426251d0d5fffe807a8641
Label: RUN TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
Created: 08/20/2025 4:01 AM UTC

Layer: bafc3203e06f50e5bba3deb1c2a38cb09122a0955bc4494214698ea9f61d9c5e
Label: COPY --chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
Created: 08/20/2025 4:01 AM UTC
...
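
One of the RUN layers listed above registers shared-library directories with the dynamic linker: it runs find with -exec dirname to print the directory that holds a library, and redirects that output into a file under /etc/ld.so.conf.d. The following is a minimal sketch of that pattern only, run against a throwaway directory instead of the real filesystem. The scratch tree, the demo subdirectory, and the empty placeholder library file are assumptions made for illustration and are not part of the image.

```shell
#!/bin/sh
set -eu

# Scratch tree standing in for the container filesystem (hypothetical paths).
root="$(mktemp -d)"
mkdir -p "$root/usr/lib/demo" "$root/etc/ld.so.conf.d"

# Empty placeholder for the shared library that the layer searches for.
touch "$root/usr/lib/demo/libtensorrt_llm.so"

# Same pattern as the layer: print the directory containing each match
# and write the result to a linker configuration fragment.
conf="$root/etc/ld.so.conf.d/tensorrt-llm.conf"
find "$root/usr" -name libtensorrt_llm.so -exec dirname {} \; > "$conf"

# Show the generated fragment: one directory path per matching library.
cat "$conf"
```

Fragments in that directory are typically read the next time ldconfig rebuilds the linker cache, so writing the directory path there is one way to make a library discoverable without extending LD_LIBRARY_PATH.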
