NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, in the data center, or on embedded devices.

Layer  Label  Created
26f4ca221ebae4b72e4f60d5ee062bfab07f54de212c1eaaa2b3025f02124cf8 CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
02/13/2026 10:06 PM UTC
d24abd3f2c0a8e9896cf11bc2b696f2e89f9c82873b9bc29c415a4e78d78e07b RUN
TRITON_VERSION=2.66.0 TRITON_CONTAINER_VERSION=26.02 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
02/13/2026 10:06 PM UTC
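
The two `find ... -exec dirname {} \;` commands in the layer above write the directory containing each shared library into a file under `/etc/ld.so.conf.d/`, presumably so that a later `ldconfig` run adds those directories to the linker cache. The sketch below reproduces the pattern in a throwaway directory. The scratch paths, the empty stand-in library file, and the `demo.conf` name are hypothetical, and the script deliberately does not run `ldconfig` or touch `/etc`.

```shell
#!/bin/sh
set -eu

# Scratch tree standing in for the image's filesystem (hypothetical layout).
root="$(mktemp -d)"
mkdir -p "$root/usr/lib/demo" "$root/etc/ld.so.conf.d"
: > "$root/usr/lib/demo/libtensorrt_llm.so"   # empty placeholder, not a real library

# Same pattern as the layer: print the directory of every match,
# one per line, and store the result as a linker search-path file.
find "$root/usr" -name libtensorrt_llm.so -exec dirname {} \; \
  > "$root/etc/ld.so.conf.d/demo.conf"

conf_line="$(cat "$root/etc/ld.so.conf.d/demo.conf")"
echo "$conf_line"
```

With a single match the file holds one directory path; if several copies of the library existed, each would contribute its own line.
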
19c2101dca722addd86e3eb8fdb3b9713c17f3c4133841216ca123f2f2beb261 COPY
--chown=1000:1000 docker/sagemaker/serve /usr/bin/.
02/13/2026 10:05 PM UTC
53f683da1217282798c70243057741d4c1a04545f6afbcc8ad53721287f8fe79 LABEL
com.amazonaws.sagemaker.capabilities.multi-models=true
02/13/2026 10:05 PM UTC
bb383ba85eb7a1e053d7849494f3b5752253743b486a3e757c4fcbb1abcc3272 LABEL
com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
02/13/2026 10:05 PM UTC
7e3f8271b7d36f7da084b7265532b5a83345252ca081113a1221a1e32d221ace RUN
TRITON_VERSION=2.66.0 TRITON_CONTAINER_VERSION=26.02 pip3 install -r python/openai/requirements.txt
02/13/2026 10:05 PM UTC
90dac8cadd7b943de60d997eb31397dcb148bf17a08399f44d837fd15a70a7e8 RUN
TRITON_VERSION=2.66.0 TRITON_CONTAINER_VERSION=26.02 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
02/13/2026 10:05 PM UTC
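
The wheel-install layer above pipes `find` into `xargs -I {}`, which substitutes each matched file path for `{}`, so `{}[all]` becomes a pip requirement of the form `path/to/file.whl[all]` (the wheel plus its `all` extra). The sketch below swaps `pip install --upgrade` for `echo` to show the expanded command without installing anything. The scratch directory and the wheel file name are made up for illustration.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for /opt/tritonserver/python (hypothetical).
pydir="$(mktemp -d)"
: > "$pydir/tritonserver-0.0.0-py3-none-any.whl"   # empty placeholder wheel
: > "$pydir/notes.txt"                              # skipped by the -name filter

# Same find | xargs shape as the layer, with echo in place of pip.
# "{}[all]" is quoted here so the shell never treats [all] as a glob.
expanded="$(find "$pydir" -maxdepth 1 -type f -name "tritonserver-*.whl" \
  | xargs -I {} echo pip install --upgrade "{}[all]")"
echo "$expanded"
```

Because `-I {}` runs the command once per input line, each wheel found would get its own `pip install` invocation.
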
68581dec538c0eb9bcb9fcdb5ab07f1a26da9b2eab67f580f815f9d69faa955a COPY
--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
02/13/2026 10:05 PM UTC
9b8c49ba3859e30983dd1e6f0ad5a4d911fdc07929eb967abe042da57aa9a003 WORKDIR
/opt/tritonserver
02/13/2026 10:05 PM UTC
a004ba5d8163835b423b793e6f476ee3766924088b6ce339696413b7455c6091 COPY
--chown=1000:1000 build/install tritonserver
02/13/2026 10:05 PM UTC
...