NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure: cloud, data center, or embedded devices.

Layer  Label  Created
7c24df5f03fcde2a4f420616016dafc43e14fd26a100d847fcf62507539b55e9 CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
12/19/2025 1:08 AM UTC
9c31954ef96dbeecbc06511fb99a353c107ae3f5f9107f30fd6f7706ff43e839 RUN
TRITON_VERSION=2.64.0 TRITON_CONTAINER_VERSION=25.12 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
12/19/2025 1:08 AM UTC
63e56f1c0a8f93510efeb07b98d24919fb45967c88a73162ec4e2c28051ded2e COPY
--chown=1000:1000 docker/sagemaker/serve /usr/bin/.
12/19/2025 1:08 AM UTC
3aad6212c1f9bce34457d1091312555e303f4e0c2989fe5f23dfac05a624fa98 LABEL
com.amazonaws.sagemaker.capabilities.multi-models=true
12/19/2025 1:08 AM UTC
98debd98d3d5a8f9a770b08139d3ded6bedf75bbf4763603fb8e98cbd9023449 LABEL
com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
12/19/2025 1:08 AM UTC
3d7692580fb821b3e1bcebd1ff02bd8f21c087a30e90aceb8d0f152fa22cf54c RUN
TRITON_VERSION=2.64.0 TRITON_CONTAINER_VERSION=25.12 pip3 install -r python/openai/requirements.txt
12/19/2025 1:08 AM UTC
ba222995f0c36742a264565dab5fdc76ea071c252a169741096f95e9e58f6964 RUN
TRITON_VERSION=2.64.0 TRITON_CONTAINER_VERSION=25.12 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
12/19/2025 1:08 AM UTC
011bcb8cf4c9e8048951141c5e19347f4140f388c6dce101462949b69183b646 COPY
--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
12/19/2025 1:08 AM UTC
7673a74e850d9ebae4d7bfc72475af02cc823a9cc4f4b8f3c37de70242db41b6 WORKDIR
/opt/tritonserver
12/19/2025 1:08 AM UTC
29bdf4bfbfdeb72f4a654a1c9a9f5b293e88ba9c728c45ce00708cb9ee8c61c9 COPY
--chown=1000:1000 build/install tritonserver
12/19/2025 1:08 AM UTC
...
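
Two shell patterns recur in the RUN layers listed above: `find ... -exec dirname {} \;` redirected into a file under /etc/ld.so.conf.d/, and `find ... | xargs -I {} pip install --upgrade {}[all]` for locally built wheels. The sketch below reproduces both in a scratch directory so they can be tried without touching a real image. The function names, `libexample.so`, the example wheel name, and the scratch layout are all hypothetical stand-ins rather than anything shipped in the container, and the wheel step only prints the pip command instead of running it.

```shell
#!/bin/sh
# Sketch only: replays two patterns from the RUN layers in a scratch directory.
# All function names, paths, and file names here are hypothetical stand-ins.

# Pattern 1: write the directory of every matching library into a conf file,
# one directory per line (the layers write such files under /etc/ld.so.conf.d/).
write_lib_dirs() {
    search_root=$1   # directory tree to search
    lib_name=$2      # library file name to look for
    conf_file=$3     # file that receives the directory list
    find "$search_root" -name "$lib_name" -exec dirname {} \; > "$conf_file"
}

# Pattern 2: dry run of the wheel install. For each wheel found directly in
# wheel_dir, print the pip command (with the [all] extra) instead of running it.
print_wheel_installs() {
    wheel_dir=$1     # directory that holds the wheels
    pattern=$2       # file name pattern, for example "example-*.whl"
    find "$wheel_dir" -maxdepth 1 -type f -name "$pattern" |
        xargs -I {} echo pip install --upgrade "{}[all]"
}

# Build a small scratch tree with one fake library and one fake wheel.
scratch=$(mktemp -d)
mkdir -p "$scratch/usr/lib/demo" "$scratch/python"
: > "$scratch/usr/lib/demo/libexample.so"
: > "$scratch/python/example-1.0-py3-none-any.whl"

write_lib_dirs "$scratch/usr" "libexample.so" "$scratch/example.conf"
cat "$scratch/example.conf"
print_wheel_installs "$scratch/python" "example-*.whl"
```

The first command prints the directory containing the fake library, and the second prints one `pip install --upgrade` line per matching wheel. The scratch directory is left in place for inspection and can be deleted afterwards.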