NVIDIA Triton Inference Server (Container)

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure: in the cloud, in the data center, or on embedded devices.
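As a rough illustration of what deploying from local storage can look like, the sketch below assembles a `docker run` command and prints it instead of executing it. The image tag `25.07-py3`, the `model_repository` path, and the helper name `triton_run_cmd` are assumptions made for this example; the listing above only confirms container version 25.07, so check the registry for the tag that matches this image.

```shell
# Assumed image tag and local model directory; adjust both to your setup.
IMAGE="${IMAGE:-nvcr.io/nvidia/tritonserver:25.07-py3}"
MODEL_REPO="${MODEL_REPO:-$PWD/model_repository}"

# Hypothetical helper: prints the command on one line (dry run, nothing is started).
triton_run_cmd() {
  printf '%s ' docker run --rm --gpus all \
    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    -v "$MODEL_REPO:/models" "$IMAGE" \
    tritonserver --model-repository=/models
  printf '\n'
}

triton_run_cmd
```

Ports 8000, 8001, and 8002 are Triton's default HTTP, gRPC, and metrics ports. On a CPU-only host, drop `--gpus all`.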

Layer | Label | Created
651de2b8e73ac6858694316bf27f61308f8f8268791ca96f21ba0f8d9e2a85e3 CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
07/28/2025 5:47 PM UTC
f0efcaa37de8b4402933fd7bf9d45913e965dc3cacb034d637f911a6d9d9202a RUN
TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 rm -fr /opt/hpcx/ompi
07/28/2025 5:47 PM UTC
24990de137048e990c520cf20da345f5ff5230b2afb1391339030bf7060b0465 ENV
LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
07/28/2025 5:47 PM UTC
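The dynamic linker searches the directories in `LD_LIBRARY_PATH` in the order listed, and the value above names `/usr/local/tensorrt/lib` twice (once with a trailing slash). A minimal sketch for listing the effective search order without repeats follows; `unique_lib_dirs` is a hypothetical helper written for this example and is not part of the image.

```shell
# Value copied from the ENV layer above.
LD_LIBRARY_PATH_VALUE="/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64"

# Hypothetical helper: one directory per line, in search order,
# with trailing slashes stripped and empty or repeated entries dropped.
unique_lib_dirs() {
  printf '%s\n' "$1" | tr ':' '\n' | sed 's:/*$::' | awk 'NF && !seen[$0]++'
}

unique_lib_dirs "$LD_LIBRARY_PATH_VALUE"
```

The duplicate entry is harmless at run time; removing repeats only makes the search order easier to read when debugging which copy of a library gets loaded.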
ae1995cc700e6784e18a6f53088a8a5971b9243fcce6a0174048ad5ee6dfc128 RUN
TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
07/28/2025 5:47 PM UTC
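The two `find ... -exec dirname {} \;` commands in this layer record the directory that contains each shared library in a file under `/etc/ld.so.conf.d/`, which `ldconfig` typically reads on its next run. The sketch below reproduces the pattern in a scratch directory so it can be tried safely; the `demo_root` layout and the nested `tensorrt_llm/libs` path are invented for illustration and do not describe the real image.

```shell
# Scratch directory standing in for the image filesystem (hypothetical layout).
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/lib/python3/site-packages/tensorrt_llm/libs" "$demo_root/etc/ld.so.conf.d"
: > "$demo_root/lib/python3/site-packages/tensorrt_llm/libs/libtensorrt_llm.so"

# Same pattern as the layer: locate the library, keep only its directory,
# and write that directory into a linker configuration file.
find "$demo_root/lib" -name libtensorrt_llm.so -exec dirname {} \; \
  > "$demo_root/etc/ld.so.conf.d/tensorrt-llm.conf"

cat "$demo_root/etc/ld.so.conf.d/tensorrt-llm.conf"
```

If the library is found in several places, the file receives one directory per match.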
efce8a1cfaff52750f261398687c97ce2deb737b4684f780095348958f99e964 COPY
--chown=1000:1000 docker/sagemaker/serve /usr/bin/.
07/28/2025 5:47 PM UTC
4bf7208456e0cf9a6deac654c9693b80cdfc88f01344a31ad50162c37b4baadb LABEL
com.amazonaws.sagemaker.capabilities.multi-models=true
07/28/2025 5:47 PM UTC
9b76d81c5ad2187a8ae3f8a86c0b170b3123b9dc50dba74c5481a64011499d5e LABEL
com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
07/28/2025 5:47 PM UTC
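The `accept-bind-to-port` label tells SageMaker that the container can listen on the port passed in the `SAGEMAKER_BIND_TO_PORT` environment variable rather than always using the default port 8080. A minimal sketch of how a serve script might honor that follows; `resolve_port` is a hypothetical helper and is not taken from the `serve` script copied into this image.

```shell
# Hypothetical helper: use the port SageMaker provides, else fall back to 8080.
resolve_port() {
  printf '%s\n' "${SAGEMAKER_BIND_TO_PORT:-8080}"
}

echo "serving on port $(resolve_port)"
```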
f6938e7bb009b789009ede704769f91e1c137d3e64242506571a9b6a5c3570a3 RUN
TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 pip3 install -r python/openai/requirements.txt
07/28/2025 5:47 PM UTC
5930afef00806f66a7563b93826b1757bd35607117f1e0c00eeeb538a934cec2 RUN
TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
07/28/2025 5:47 PM UTC
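This layer uses `find ... | xargs -I {}` to install each matching wheel together with its `[all]` extras. The sketch below shows the same pattern as a dry run, with `echo` in front of `pip` so nothing is installed; the wheel file names and the `wheel_dir` scratch directory are made up for the example.

```shell
# Scratch directory with placeholder wheels (hypothetical names and versions).
wheel_dir="$(mktemp -d)"
: > "$wheel_dir/tritonserver-0.0.0-py3-none-any.whl"
: > "$wheel_dir/tritonfrontend-0.0.0-py3-none-any.whl"
mkdir "$wheel_dir/nested"
: > "$wheel_dir/nested/tritonserver-9.9.9-py3-none-any.whl"  # skipped by -maxdepth 1

# xargs substitutes each path for {} and appends the [all] extras marker.
planned="$(find "$wheel_dir" -maxdepth 1 -type f -name "tritonserver-*.whl" \
  | xargs -I {} echo pip install --upgrade "{}[all]")"

printf '%s\n' "$planned"
```

Quoting `"{}[all]"` is a small defensive change in this sketch: unquoted, the shell treats `[all]` as a glob pattern, which normally matches nothing and is passed through unchanged, but quoting removes the dependence on that behavior.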
903e3c61601860b37def0f095d9a1e2b067ee4cc21e0b14b97f53724e2458ca1 COPY
--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
07/28/2025 5:46 PM UTC
...