NVIDIA
Triton Inference Server
Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, in the data center, or on embedded devices.

Layer   Label   Created
dd7851006a49fc3fb101ef4ac1925f81094b39012ba8e72e423f79ffadb09ceb CONFIG
Entrypoint /opt/nvidia/nvidia_entrypoint.sh; WorkingDir /opt/tritonserver
10/03/2025 5:46 PM UTC
eeb9187b9c56fd5b632bec6452e99fe8049caf3ffc4f922d169d296762469fbd RUN
TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 rm -fr /opt/hpcx/ompi
10/03/2025 5:46 PM UTC
76aaed778e7b82ba5c3b87576f54b7d7884961860119dd177b41a9d2e9212374 ENV
LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
10/03/2025 5:46 PM UTC
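
The `LD_LIBRARY_PATH` value in the ENV layer above is a colon-separated list of directories, and `/usr/local/tensorrt/lib` appears in it twice (once with a trailing slash). As a minimal sketch, the entries can be split and de-duplicated like this; the helper name `unique_search_dirs` is hypothetical and not part of the image.

```python
import posixpath

# Value copied from the ENV layer listed above.
LD_LIBRARY_PATH = (
    "/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:"
    "/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:"
    "/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
)


def unique_search_dirs(value: str) -> list:
    """Split a colon-separated search path and drop repeated directories.

    The first occurrence of each directory is kept, so the relative
    lookup order of the remaining entries is unchanged.
    """
    seen = set()
    result = []
    for entry in value.split(":"):
        if not entry:
            continue  # empty entries are simply skipped in this sketch
        normalized = posixpath.normpath(entry)  # strips the trailing slash
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


dirs = unique_search_dirs(LD_LIBRARY_PATH)
for d in dirs:
    print(d)
```

The sketch only inspects the string; it does not check whether the directories exist in the container.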
ed6d72ab94d73e7d31dbd399797434c88d489723b22fbc2324a1aa577feb2111 RUN
TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 ldconfig &&
  ARCH="$(uname -i)" &&
  rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data &&
  rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python &&
  rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples &&
  pip3 install --no-cache-dir transformers &&
  find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf &&
  find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf &&
  pip3 install --no-cache-dir grpcio-tools==1.64.0 &&
  pip3 uninstall -y setuptools
10/03/2025 5:46 PM UTC
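
The two `find ... -exec dirname {} \; > /etc/ld.so.conf.d/...` steps in the RUN layer above write the directory of each matching shared library into a linker configuration file, one directory per line. A rough Python equivalent is sketched below under a few assumptions: the function names are hypothetical, the results are sorted (which `find` does not do), and only regular files are matched. The demo runs in a temporary directory so nothing under `/etc` is touched.

```python
import tempfile
from pathlib import Path


def library_dirs(root, library_name):
    """Return the sorted directories under `root` that contain `library_name`."""
    return sorted(
        {str(p.parent) for p in Path(root).rglob(library_name) if p.is_file()}
    )


def write_ld_conf(root, library_name, conf_path):
    """Write one directory per line to `conf_path` and return the directories."""
    dirs = library_dirs(root, library_name)
    Path(conf_path).write_text("".join(d + "\n" for d in dirs))
    return dirs


# Demo in a throwaway directory tree that mimics the layout in the listing.
with tempfile.TemporaryDirectory() as tmp:
    lib_dir = Path(tmp, "opt", "tritonserver", "lib")
    lib_dir.mkdir(parents=True)
    (lib_dir / "libtritonserver.so").touch()
    conf = Path(tmp, "triton-tensorrtllm-worker.conf")
    found = write_ld_conf(tmp, "libtritonserver.so", conf)
    conf_text = conf.read_text()

print(conf_text, end="")
```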
bd6e9d5ebc9f31b404499141189fd60d3c4e5673c227c6c986346b978a4f6bf8 COPY
--chown=1000:1000 docker/sagemaker/serve /usr/bin/.
10/03/2025 5:46 PM UTC
d59114cd48eaea3fa010c7c9828435f87ea1835adff06fbc37d79545a00a23a0 LABEL
com.amazonaws.sagemaker.capabilities.multi-models=true
10/03/2025 5:46 PM UTC
38de19d76f9b061d0c36cb93185c5a23c6d46e52d25fad8b6357fae0d737020d LABEL
com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
10/03/2025 5:46 PM UTC
7499d406d08c68854ed8ddd728e5742373dcfeedeeed1b128736a1b78e6fad81 RUN
TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 pip3 install -r python/openai/requirements.txt
10/03/2025 5:46 PM UTC
033128763ef0591f9f41f2ae9b4807b86165260e476cc2d26f0e02bf7aade361 RUN
TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] &&
  find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]
10/03/2025 5:46 PM UTC
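
The RUN layer above locates the `tritonserver-*.whl` and `tritonfrontend-*.whl` files directly inside `/opt/tritonserver/python` (`-maxdepth 1`) and passes each one to `pip install --upgrade` with the `[all]` extra appended. The selection step can be sketched in Python as follows. The function name and the wheel file names in the demo are hypothetical, and the sketch only builds the requirement strings; it does not invoke pip.

```python
import tempfile
from pathlib import Path


def wheel_requirements(directory, prefix, extra="all"):
    """Build '<wheel path>[extra]' strings for wheels directly in `directory`.

    Path.glob without '**' does not descend into subdirectories, which
    mirrors `-maxdepth 1`; is_file() mirrors `-type f`.
    """
    wheels = sorted(
        p for p in Path(directory).glob(f"{prefix}-*.whl") if p.is_file()
    )
    return [f"{p}[{extra}]" for p in wheels]


# Demo with made-up wheel names in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "tritonserver-0.0.0-py3-none-any.whl").touch()
    Path(tmp, "tritonfrontend-0.0.0-py3-none-any.whl").touch()
    nested = Path(tmp, "nested")
    nested.mkdir()
    Path(nested, "tritonserver-9.9.9-py3-none-any.whl").touch()  # ignored: too deep
    server_reqs = wheel_requirements(tmp, "tritonserver")
    frontend_reqs = wheel_requirements(tmp, "tritonfrontend")

print(server_reqs)
print(frontend_reqs)
```

Each resulting string could then be handed to pip, for example through `subprocess.run([sys.executable, "-m", "pip", "install", "--upgrade", requirement])`.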
33456c3216bc136e2f00762caf2bb354b982c0ffb00aa9aacb6d4dd6d8335dad COPY
--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .
10/03/2025 5:46 PM UTC
...
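
Each entry in the layer listing above pairs a 64-character hexadecimal digest with an instruction label (CONFIG, RUN, ENV, COPY, LABEL) and a creation timestamp. A small parsing sketch is shown below. The function names are hypothetical, the pattern tolerates the digest and label being written with or without a space between them, and the timestamp is assumed to be in month/day/year order, which the listing itself does not state.

```python
import re
from datetime import datetime, timezone

# 64 hex characters, optional whitespace, then an upper-case instruction label.
LAYER_RE = re.compile(r"^([0-9a-f]{64})\s*([A-Z]+)$")


def parse_layer_header(line):
    """Return (digest, label) for a layer header line, or None if it does not match."""
    match = LAYER_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def parse_created(text):
    """Parse a timestamp such as '10/03/2025 5:46 PM UTC' into an aware datetime.

    Assumption: the date is month/day/year.
    """
    parsed = datetime.strptime(text.strip(), "%m/%d/%Y %I:%M %p UTC")
    return parsed.replace(tzinfo=timezone.utc)


# First entry of the listing.
header = parse_layer_header(
    "dd7851006a49fc3fb101ef4ac1925f81094b39012ba8e72e423f79ffadb09ceb CONFIG"
)
created = parse_created("10/03/2025 5:46 PM UTC")
print(header, created.isoformat())
```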
