NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, data center, or embedded devices.

Layer | Label | Created

Layer:   6a5103ef3889b713bf14003fab7329937fbd084b06d9b08acbef20bcab91680f
Label:   CONFIG Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
Created: 10/29/2025 9:28 PM UTC

Layer:   0d1f870417fb102f1050825a9cb3a724eb8103c3d64f22a5d710736994451566
Label:   RUN TRITON_VERSION=2.62.0 TRITON_CONTAINER_VERSION=25.10 ldconfig &&
           ARCH="$(uname -i)" &&
           rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
           rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
           rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
           pip3 install --no-cache-dir transformers &&
           find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
           find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
           pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
           pip3 uninstall -y setuptools
Created: 10/29/2025 9:28 PM UTC

Layer:   1062af48b9df3ef07e77ff7073e38b733397eb983371d157ee1daae936e1a533
Label:   COPY --chown=1000:1000 docker/sagemaker/serve /usr/bin/.
Created: 10/29/2025 9:27 PM UTC

Layer:   8f5b409f390eb6ff5c3814df48a096fd054306cc1b9e6d4923c89b4866f42bd4
Label:   LABEL com.amazonaws.sagemaker.capabilities.multi-models=true
Created: 10/29/2025 9:27 PM UTC

Layer:   e22c0546523593b476389e3a75f46fd71b22c98d8ac35ef4770cd432aaf21ec1
Label:   LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
Created: 10/29/2025 9:27 PM UTC

Layer:   6312675c14a5000adab7f2ca1d859bcee8cd70c67b07fbd5f3e0b5f49aaa2141
Label:   RUN TRITON_VERSION=2.62.0 TRITON_CONTAINER_VERSION=25.10 pip3 install -r python/openai/requirements.txt
Created: 10/29/2025 9:27 PM UTC

Layer:   5a54c65ff8c05ec4c90245c715109f86fab16841698f0fcde26ee89514a5b9fa
Label:   RUN TRITON_VERSION=2.62.0 TRITON_CONTAINER_VERSION=25.10 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
           find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
Created: 10/29/2025 9:27 PM UTC

Layer:   ed29935cdf2e01ea7351885e311ac00be3ddec46391a0e49953461ae14963b3b
Label:   COPY --chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
Created: 10/29/2025 9:27 PM UTC

Layer:   e14a1f23fbec642fb16dc3e5c9571212b2c69cd6b73cbb70f7c1c3b9bea4d2c7
Label:   WORKDIR /opt/tritonserver
Created: 10/29/2025 9:27 PM UTC

Layer:   b00feb694c6e68e363adf8837c9a05382dd883cc0d53965f2841e16d64ca4af0
Label:   COPY --chown=1000:1000 build/install tritonserver
Created: 10/29/2025 9:27 PM UTC
...
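
The two `find ... -exec dirname {} \; > /etc/ld.so.conf.d/<name>.conf` commands in the layer listing above write the directory of each matching shared library into a linker configuration file, one directory per line, which is the format `ldconfig` reads. The following is a minimal Python sketch of that idea, not the build's actual tooling: the helper names `write_ld_conf` and `demo` and the `usr/lib/example` directory are hypothetical, and the demo runs in a throwaway temporary directory rather than touching `/usr` or `/etc`.

```python
import tempfile
from pathlib import Path


def write_ld_conf(search_root: Path, lib_name: str, conf_path: Path) -> list[str]:
    """Hypothetical helper: write the parent directory of every entry named
    lib_name under search_root to conf_path, one directory per line."""
    # Roughly `find <search_root> -name <lib_name> -exec dirname {} \;`.
    # Results are sorted here for determinism; find itself does not sort.
    dirs = [str(p.parent) for p in sorted(search_root.rglob(lib_name))]
    # Roughly the `> conf_path` redirection: overwrite the file.
    conf_path.write_text("".join(d + "\n" for d in dirs))
    return dirs


def demo() -> list[str]:
    """Build a tiny fake tree, run the helper, and return the directories
    it recorded, relative to the temporary root."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        lib_dir = root / "usr" / "lib" / "example"  # assumed layout, illustration only
        lib_dir.mkdir(parents=True)
        (lib_dir / "libtensorrt_llm.so").touch()

        conf = root / "tensorrt-llm.conf"
        dirs = write_ld_conf(root / "usr", "libtensorrt_llm.so", conf)

        # The conf file holds exactly the directories that were found.
        assert conf.read_text().splitlines() == dirs
        return [str(Path(d).relative_to(root)) for d in dirs]


if __name__ == "__main__":
    print(demo())
```

In the listed layer, `ldconfig` appears at the start of the command chain, before the `.conf` files are written, so the new entries would presumably be picked up only by a later `ldconfig` run; the listing shown here does not say where that happens.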
