NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, in the data center, or on embedded devices.

Layer | Label | Created
2b708af25c0d7341e03519c1def7ae7e76b8d655198adc6e60392967417b3208 CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
11/20/2025 9:24 PM UTC
9cfa0a0803103f347a8a568f86f63da18f1c10c356d61b4c91a4c9e78c7e914b RUN
TRITON_VERSION=2.63.0 TRITON_CONTAINER_VERSION=25.11 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
11/20/2025 9:24 PM UTC
f08adf012e2571a97fd4b5148158d8331e57baecc94b3b1f5dd1cacdc83e7d15 COPY
--chown=1000:1000 docker/sagemaker/serve /usr/bin/.
11/20/2025 9:24 PM UTC
03bbf2c5affc1469226193685f22984c4c8e80275b3c5f77983bfa417119be28 LABEL
com.amazonaws.sagemaker.capabilities.multi-models=true
11/20/2025 9:24 PM UTC
10b7fc6ef507588b92d13aac11d6f8ba44111e251fd63b4dd1ebe261f2b3fab2 LABEL
com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
11/20/2025 9:24 PM UTC
ec3ea2198ef03ee1d50726476c214a2d6d398160ba9e1600b41e0234755de683 RUN
TRITON_VERSION=2.63.0 TRITON_CONTAINER_VERSION=25.11 pip3 install -r python/openai/requirements.txt
11/20/2025 9:24 PM UTC
67c2cf087ad2aa0ebb37d4e63f2bb5863f2fadd5db58154b00e8df11da7772a1 RUN
TRITON_VERSION=2.63.0 TRITON_CONTAINER_VERSION=25.11 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
11/20/2025 9:24 PM UTC
f0a0cd6e85b782d6cc745a2e95db200f87718ec392314d0cf6af0f2726231c82 COPY
--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
11/20/2025 9:24 PM UTC
ae74a82c28ab06908f902c8dadc9244fa09f5f1771d8bed8efb4fec5b9c0c8c3 WORKDIR
/opt/tritonserver
11/20/2025 9:24 PM UTC
0859b5feddfff982d7840ddfb012e463bfd7bc54e6356cdd8b6aaa980fb30110 COPY
--chown=1000:1000 build/install tritonserver
11/20/2025 9:24 PM UTC
...
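
Two shell idioms recur in the RUN layers listed above: `find ... -exec dirname {} \;` redirected into a file under `/etc/ld.so.conf.d`, which records the directory that holds a shared library so that `ldconfig` can pick it up, and `find ... | xargs -I {} pip install --upgrade {}[all]`, which installs each matching wheel with its `all` extra. The sketch below reproduces both patterns in a temporary directory so they can be tried without touching a real system. The names `libdemo.so`, `demo.conf`, and `demo-1.0-py3-none-any.whl` are hypothetical stand-ins, not files from the image, and `echo` replaces the real `pip` call, so nothing is installed.

```shell
#!/bin/sh
# Sketch only: every path and file name here is a hypothetical stand-in
# for the real files referenced in the image's RUN layers.
set -eu

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# 1. Linker-path pattern: write the directory containing a library
#    into a conf file (the image writes to /etc/ld.so.conf.d instead).
mkdir -p "$work/usr/lib/demo" "$work/ld.so.conf.d"
: > "$work/usr/lib/demo/libdemo.so"   # empty placeholder, not a real library

find "$work/usr" -name libdemo.so -exec dirname {} \; \
  > "$work/ld.so.conf.d/demo.conf"

conf_line="$(cat "$work/ld.so.conf.d/demo.conf")"
echo "conf entry: $conf_line"

# 2. Wheel-install pattern: build one pip command per matching wheel.
#    echo stands in for running pip; '{}[all]' is quoted so the shell
#    does not treat [all] as a glob pattern.
mkdir -p "$work/python"
: > "$work/python/demo-1.0-py3-none-any.whl"   # empty placeholder wheel

install_cmd="$(find "$work/python" -maxdepth 1 -type f -name 'demo-*.whl' \
  | xargs -I {} echo pip install --upgrade '{}[all]')"
echo "would run: $install_cmd"
```

In the image itself, the first pattern is followed by `ldconfig`, which rebuilds the linker cache from those conf files; the sketch stops short of that step because it would require root and a real library.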