Triton Inference Server

NVIDIA

Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure in the cloud, the data center, or embedded devices.
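
Once a container from this image is running, a client usually confirms that the server is up before sending inference requests. The following is a minimal sketch of such a check, not an official client. It assumes the server's HTTP endpoint is reachable on `localhost:8000` and exposes the `/v2/health/ready` readiness route. The host, port, and the helper names `health_url` and `server_ready` are illustrative assumptions, not taken from this page.

```python
import urllib.error
import urllib.request


def health_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the readiness URL. Host and port are assumptions; adjust to your deployment."""
    return f"http://{host}:{port}/v2/health/ready"


def server_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True only if the readiness probe answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("ready" if server_ready(health_url()) else "not ready")
```

Returning `False` on any connection error keeps the helper usable inside a simple retry loop while the container is still starting.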

NVIDIA AI Enterprise Supported

Layer	Instruction	Command	Created
sha256:adc8421e316cc666700959addd41d6ce30ded9bb57a875d7c2b11670a7745509	COPY	`/opt/hpcx/ompi /opt/hpcx/ompi`	10/03/2025 5:46 PM UTC
sha256:ec06cc71cccb9e67555e6bfd66897d59d52b3c6f5223ff5c213db510f738e18e	RUN	`TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 rm -fr /opt/hpcx/ompi`	10/03/2025 5:46 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	ENV	`LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64`	10/03/2025 5:46 PM UTC
sha256:5894ee27fd8e113311554d8ed81713a9fdabeca434f9bbe65ae8b69c8eb5de39	RUN	`TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 ldconfig && ARCH="$(uname -i)" && rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data && rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python && rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples && pip3 install --no-cache-dir transformers && find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf && find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf && pip3 install --no-cache-dir grpcio-tools==1.64.0 && pip3 uninstall -y setuptools`	10/03/2025 5:46 PM UTC
sha256:8f506330c6b995e4bdaaefd006215d854138b0909eae419025f31995d13467a7	COPY	`--chown=1000:1000 docker/sagemaker/serve /usr/bin/.`	10/03/2025 5:46 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.multi-models=true`	10/03/2025 5:46 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true`	10/03/2025 5:46 PM UTC
sha256:703712caa41fe543d47d51bd0e74eef69b9e34a3b85bf838d285de79c20c7525	RUN	`TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 pip3 install -r python/openai/requirements.txt`	10/03/2025 5:46 PM UTC
sha256:c4589459c32a1a291184c517119fcce803ccc5cbd7ce3c64ff53290d9b31c387	RUN	`TRITON_VERSION=2.61.0 TRITON_CONTAINER_VERSION=25.09 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] && find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]`	10/03/2025 5:46 PM UTC
sha256:9b9172fc5183dfbc6de70bd22ecae584b85e98fab187f3d3fa80c2e4c42d42a8	COPY	`--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .`	10/03/2025 5:46 PM UTC
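
The `ENV` layer in the table above sets a long `LD_LIBRARY_PATH` that is hard to read in one line. As an illustrative sketch, the snippet below splits that value into its directories and drops entries that differ only by a trailing slash. The value is copied from the table; the helper name `unique_dirs` is hypothetical and not part of the image.

```python
import posixpath

# Value copied verbatim from the ENV layer listed in the table.
LD_LIBRARY_PATH = (
    "/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:"
    "/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:"
    "/usr/local/nvidia/lib:/usr/local/nvidia/lib64"
)


def unique_dirs(value: str) -> list[str]:
    """Split a colon-separated search path, normalize each entry, keep first occurrences in order."""
    seen: set[str] = set()
    result: list[str] = []
    for entry in value.split(":"):
        if not entry:
            continue  # skip empty segments such as a trailing colon
        normalized = posixpath.normpath(entry)  # removes the trailing slash
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


if __name__ == "__main__":
    for directory in unique_dirs(LD_LIBRARY_PATH):
        print(directory)
```

Order is preserved because the dynamic loader searches these directories left to right, so the first occurrence of a directory is the one that matters.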