NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, in the data center, or on embedded devices.

Layers (Layer, Label, Created)

Layer:   0e145e1b2ba4a4defed3b2bd44a21410dc7c0e600df4a532fe593204edf9987b
Label:   CONFIG
  Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
Created: 01/27/2026 11:58 PM UTC

Layer:   b53c9528dcc91ab958957f0d9c1cdf9e7f6d5801a7d9b13ce7cf6130630a9e80
Label:   RUN
  TRITON_VERSION=2.65.0 TRITON_CONTAINER_VERSION=26.01 ldconfig &&
    ARCH="$(uname -i)" &&
    rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
    rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
    rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
    pip3 install --no-cache-dir transformers &&
    find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
    find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
    pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
    pip3 uninstall -y setuptools
Created: 01/27/2026 11:58 PM UTC

Layer:   efb872501ed80f9535f37ef7be855007d6f38f6ef676eb73eecdbe7864f1d4eb
Label:   COPY
  --chown=1000:1000 docker/sagemaker/serve /usr/bin/.
Created: 01/27/2026 11:58 PM UTC

Layer:   18ba5d4c132d1e15aad19b1d84aa7101b38960942fd74c2c989d0d3f4728e0cb
Label:   LABEL
  com.amazonaws.sagemaker.capabilities.multi-models=true
Created: 01/27/2026 11:58 PM UTC

Layer:   1b197e2f30dda216ce8f3fceec8b608a82aef818f28d2fc64827e055084574a5
Label:   LABEL
  com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
Created: 01/27/2026 11:58 PM UTC

Layer:   5f3da7dfc984e488656777f5d4d540660026cd41d64d9b4849e600f2fba48433
Label:   RUN
  TRITON_VERSION=2.65.0 TRITON_CONTAINER_VERSION=26.01 pip3 install -r python/openai/requirements.txt
Created: 01/27/2026 11:58 PM UTC

Layer:   5cdfb19ee4c66230c2732668c21be050a60a0aae3573fe7c14ee7456455e2f74
Label:   RUN
  TRITON_VERSION=2.65.0 TRITON_CONTAINER_VERSION=26.01 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
    find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
Created: 01/27/2026 11:57 PM UTC

Layer:   20ef3359fc614588d00f4cbc273899d23ce9060a6cda480775e4b7940192ca7c
Label:   COPY
  --chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
Created: 01/27/2026 11:57 PM UTC

Layer:   393b36e59dbaa264b7204c39ced59716fba1114fbb14be92cf4ff49cb35ed253
Label:   WORKDIR
  /opt/tritonserver
Created: 01/27/2026 11:57 PM UTC

Layer:   531eb39a187e5271f4cecb38a26bc1b3599915e397953393d23f32067b6b2bc7
Label:   COPY
  --chown=1000:1000 build/install tritonserver
Created: 01/27/2026 11:57 PM UTC
...
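
One of the RUN layers listed above uses `find ... -name <library> -exec dirname {} \;` and redirects the result into a file under /etc/ld.so.conf.d, so that each directory containing the named shared library is recorded one per line. The following Python sketch mimics that step on a throwaway directory tree to show what such a file would contain. The helper names (`find_library_dirs`, `write_ld_conf`, `demo`) and the example directory layout are assumptions made for illustration; they are not part of the image. Unlike plain `find`, the sketch sorts its results.

```python
import os
import tempfile
from pathlib import Path


def find_library_dirs(root, lib_name):
    """Return the sorted directories under `root` that contain a file named `lib_name`."""
    dirs = set()
    for current_dir, _subdirs, files in os.walk(root):
        if lib_name in files:
            dirs.add(current_dir)
    return sorted(dirs)


def write_ld_conf(conf_path, directories):
    """Write one directory per line, as the shell redirect in the layer does."""
    Path(conf_path).write_text("".join(f"{d}\n" for d in directories))


def demo():
    # Hypothetical tree standing in for /usr; the real image layout is not
    # shown in the layer listing.
    with tempfile.TemporaryDirectory() as tmp:
        lib_dir = Path(tmp) / "usr" / "local" / "lib" / "example"
        lib_dir.mkdir(parents=True)
        (lib_dir / "libtensorrt_llm.so").touch()

        found = find_library_dirs(Path(tmp) / "usr", "libtensorrt_llm.so")
        conf = Path(tmp) / "tensorrt-llm.conf"
        write_ld_conf(conf, found)
        return found, conf.read_text(), str(lib_dir)


if __name__ == "__main__":
    found, conf_text, _ = demo()
    print(conf_text, end="")
```

Running the script prints the single temporary directory that holds the placeholder library file, which is the same one-path-per-line shape the layer writes to tensorrt-llm.conf.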
