Triton Inference Server

NVIDIA

Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, from local or cloud storage, on any GPU- or CPU-based infrastructure, whether in the cloud, in the data center, or on embedded devices.

NVIDIA AI Enterprise Supported

Layer	Instruction	Command	Created
sha256:06a44e76eec1a6a7b466ca514b09c658109f392f2e1986a9cacda452549235ad	COPY	`/opt/hpcx/ompi /opt/hpcx/ompi`	07/28/2025 5:47 PM UTC
sha256:f993fb1847175937f3fc2ccfb85504683e937a3ec7da23443b5cd13ac9812563	RUN	`TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 rm -fr /opt/hpcx/ompi`	07/28/2025 5:47 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	ENV	`LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64`	07/28/2025 5:47 PM UTC
sha256:c3d7f0b4c05a28807c42d19e8b7dce00619534ce69c8de88871e5dfaf90e1688	RUN	`TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 ldconfig && ARCH="$(uname -i)" && rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data && rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python && rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples && pip3 install --no-cache-dir transformers && find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf && find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf && pip3 install --no-cache-dir grpcio-tools==1.64.0 && pip3 uninstall -y setuptools`	07/28/2025 5:47 PM UTC
sha256:56008764eca91bf979133b8b663a95f537369b999e4b800905b490f1c13a8bd5	COPY	`--chown=1000:1000 docker/sagemaker/serve /usr/bin/.`	07/28/2025 5:47 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.multi-models=true`	07/28/2025 5:47 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true`	07/28/2025 5:47 PM UTC
sha256:839231ef89c35b2e0219fd7b0768a3889e01e63ce4daf3e4690e0289cf028dd5	RUN	`TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 pip3 install -r python/openai/requirements.txt`	07/28/2025 5:47 PM UTC
sha256:eea5c9b4df8d15644f783d1315d23dc93b177b0f258c90cf284287a097aa1d6e	RUN	`TRITON_VERSION=2.59.1 TRITON_CONTAINER_VERSION=25.07 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] && find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]`	07/28/2025 5:47 PM UTC
sha256:804a664f67ecc68b6ccbed01517b783528017aa34672ea3fffc12af0f1c2fa98	COPY	`--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .`	07/28/2025 5:46 PM UTC
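
In the listing above, the ENV and LABEL rows all carry the same digest, which suggests that these instructions only change image metadata and add no filesystem content. The following is a minimal Python sketch, not part of the original page, that splits rows of this listing into fields and flags such metadata-only entries. It assumes the rows are tab-separated with four fields each; the names `EMPTY_LAYER` and `parse_layer_row` are hypothetical and chosen for illustration.

```python
# Digest shared by the ENV and LABEL rows in the listing above.
# Assumption: rows carrying this digest add no filesystem content.
EMPTY_LAYER = (
    "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
)


def parse_layer_row(row: str) -> dict:
    """Split one tab-separated row into its four fields."""
    digest, instruction, command, created = row.rstrip("\n").split("\t")
    return {
        "digest": digest,
        "instruction": instruction,
        "command": command.strip("`"),  # drop the surrounding backticks
        "created": created,
        "metadata_only": digest == EMPTY_LAYER,
    }


# Two rows copied from the listing, with tabs written as \t.
rows = [
    "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
    "\tLABEL\t`com.amazonaws.sagemaker.capabilities.multi-models=true`"
    "\t07/28/2025 5:47 PM UTC",
    "sha256:804a664f67ecc68b6ccbed01517b783528017aa34672ea3fffc12af0f1c2fa98"
    "\tCOPY\t`--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .`"
    "\t07/28/2025 5:46 PM UTC",
]

parsed = [parse_layer_row(r) for r in rows]
for entry in parsed:
    kind = "metadata only" if entry["metadata_only"] else "filesystem layer"
    print(f'{entry["instruction"]:<6} {kind:<17} {entry["command"]}')
```

Filtering on `metadata_only` leaves just the rows that correspond to distinct filesystem layers, which is usually the relevant set when comparing two image versions.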