NVIDIA
Triton Inference Server

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure: in the cloud, in the data center, or on embedded devices.

Layer    Label    Created
9a693eacd414869e76a1d9a4ac92763be13c6ca5fc1d0afcd6a03b37f27c81b2 CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
06/18/2025 10:28 PM UTC
1509a97210175006dba0990a3a295a794c12d63c17c3daa7a83f2e696c4f79ce RUN
TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 rm -fr /opt/hpcx/ompi
06/18/2025 10:28 PM UTC
e54b426a7b014eafb9c9941299575bfc178ec7074ead944bcda5e05dc341bec9 ENV
LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
06/18/2025 10:28 PM UTC
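
The ENV layer above sets a colon-separated library search path, and `/usr/local/tensorrt/lib` appears in it twice (once with a trailing slash). The following is a minimal Python sketch, not part of the image, for listing the distinct directories in such a value. The helper name `unique_search_dirs` is hypothetical, and skipping empty entries is a simplification made here.

```python
import posixpath

# Value copied from the ENV layer above.
LD_LIBRARY_PATH = (
    "/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:"
    "/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:"
    "/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
)


def unique_search_dirs(value):
    """Split a colon-separated path list and drop repeats, keeping first-seen order."""
    seen = set()
    result = []
    for entry in value.split(":"):
        if not entry:
            # Simplification: empty entries are ignored in this sketch.
            continue
        # normpath removes the trailing slash so both spellings compare equal.
        normalized = posixpath.normpath(entry)
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


dirs = unique_search_dirs(LD_LIBRARY_PATH)
for d in dirs:
    print(d)
```

The duplicate is harmless to the dynamic loader; the sketch only makes the effective search order easier to read.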
a9b77c924c62eeed932d687ed32b913c7289adf88f34b289b547c770c2035609 RUN
TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
06/18/2025 10:28 PM UTC
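
The two `find ... -exec dirname {} \; > /etc/ld.so.conf.d/*.conf` steps in the RUN layer above locate a shared library and write its containing directory into a loader configuration file. A rough Python equivalent of that pattern is sketched below for illustration. The function names are hypothetical, the demo runs against a temporary directory rather than `/usr`, and results are sorted for determinism, which `find` does not do. Unlike `find -name`, this sketch matches only non-directory entries.

```python
import os
import tempfile


def dirs_containing(root, filename):
    """Return every directory under `root` that holds an entry named `filename`."""
    found = []
    for current_dir, _subdirs, files in os.walk(root):
        if filename in files:
            found.append(current_dir)
    return sorted(found)


def write_ld_conf(conf_path, directories):
    """Write one directory per line, the layout used by files in /etc/ld.so.conf.d."""
    with open(conf_path, "w") as fh:
        for directory in directories:
            fh.write(directory + "\n")


# Demo in a throwaway tree so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    lib_dir = os.path.join(tmp, "usr", "lib", "demo")
    os.makedirs(lib_dir)
    open(os.path.join(lib_dir, "libtensorrt_llm.so"), "w").close()

    conf_path = os.path.join(tmp, "tensorrt-llm.conf")
    write_ld_conf(conf_path, dirs_containing(os.path.join(tmp, "usr"), "libtensorrt_llm.so"))

    with open(conf_path) as fh:
        conf_lines = fh.read().splitlines()

print(conf_lines)
```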
bd7c5d6bc62156abe17751b45f6aa6fd55b6b06255636fd52cef8d1d80ef0b24 COPY
--chown=1000:1000 docker/sagemaker/serve /usr/bin/.
06/18/2025 10:28 PM UTC
d2b56baced5b78c129e439c3239561d7f4e9c1532c159d8a5c7e64361daa5da1 LABEL
com.amazonaws.sagemaker.capabilities.multi-models=true
06/18/2025 10:28 PM UTC
c3f06454706f693120bfa89e42b691f1811d2477bbe7a12984701989928b0e32 LABEL
com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
06/18/2025 10:28 PM UTC
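
The `accept-bind-to-port` label above advertises that the container can listen on a port chosen by SageMaker. The listing does not show how the `serve` script handles this, so the sketch below is an assumption based on AWS's documented container contract, in which the port is passed through the `SAGEMAKER_BIND_TO_PORT` environment variable and 8080 is the default. The function name `resolve_bind_port` is hypothetical.

```python
import os

DEFAULT_PORT = 8080  # assumed default when SageMaker supplies no override


def resolve_bind_port(environ=os.environ):
    """Pick the port to listen on from SAGEMAKER_BIND_TO_PORT, else the default."""
    raw = environ.get("SAGEMAKER_BIND_TO_PORT")
    if raw is None or raw == "":
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return port


print(resolve_bind_port({}))
print(resolve_bind_port({"SAGEMAKER_BIND_TO_PORT": "9000"}))
```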
164eb2d821b61a926f9dde36f20d4847d670234967ce498a70e1c8b5c63b6a25 RUN
TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 pip3 install -r python/openai/requirements.txt
06/18/2025 10:28 PM UTC
0e9e4101c4a2cf0a048fca9c02e9da5c00364f648a2b4699daa1edfb6b664d64 RUN
TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
06/18/2025 10:28 PM UTC
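
The RUN layer above finds wheel files directly inside `/opt/tritonserver/python` (no subdirectories, regular files only) and upgrades each one with the `all` extra. The selection logic can be sketched in Python as follows. This is an illustration only: `build_install_commands` is a hypothetical helper, the wheel file names in the demo are made up, and the commands are built but not executed.

```python
import fnmatch
import os
import tempfile


def build_install_commands(directory, pattern, extra="all"):
    """Build one `pip install --upgrade <wheel>[extra]` command per matching file.

    Mirrors `find <directory> -maxdepth 1 -type f -name <pattern>`: only
    regular files directly inside `directory` are considered.
    """
    commands = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        is_regular = os.path.isfile(path) and not os.path.islink(path)
        if is_regular and fnmatch.fnmatchcase(name, pattern):
            commands.append(["pip", "install", "--upgrade", "%s[%s]" % (path, extra)])
    return commands


# Demo with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "tritonserver-0.0.0-py3-none-any.whl"), "w").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()
    nested = os.path.join(tmp, "nested")
    os.makedirs(nested)
    # Ignored: deeper than one level, like -maxdepth 1.
    open(os.path.join(nested, "tritonserver-9.9.9-py3-none-any.whl"), "w").close()

    demo_commands = build_install_commands(tmp, "tritonserver-*.whl")

for command in demo_commands:
    print(" ".join(command))
```

Passing each command to `subprocess.run` would perform the actual installation; that step is left out so the sketch has no side effects.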
5c7c9c240f52181f08b3bbafd46a55d94a9acf70680324e2b36ff0998cd63f00 COPY
--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
06/18/2025 10:28 PM UTC
...