Triton Inference Server

NVIDIA

Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure, whether in the cloud, in the data center, or on embedded devices.

NVIDIA AI Enterprise Supported

Layer	Instruction	Command	Created
sha256:603a1be17b8cdc82428a8ae8bcd579321c57a59235c9740d7ed3450283251706	COPY	`/opt/hpcx/ompi /opt/hpcx/ompi`	08/20/2025 4:01 AM UTC
sha256:923e7ab0f1fe2b835e7f418a052f0c02daf07545c2599505662a90248b233d93	RUN	`TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 rm -fr /opt/hpcx/ompi`	08/20/2025 4:01 AM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	ENV	`LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64`	08/20/2025 4:01 AM UTC
sha256:ff7ba96cff5f1f08b41e6d925a1f172ae0e7a6c5bfa38a46bc79d6d29d191a01	RUN	`TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 ldconfig && ARCH="$(uname -i)" && rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data && rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python && rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples && pip3 install --no-cache-dir transformers && find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf && find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf && pip3 install --no-cache-dir grpcio-tools==1.64.0 && pip3 uninstall -y setuptools`	08/20/2025 4:01 AM UTC
sha256:6549d60b078cc0626c8ba52235522f68290a8be22e655240a583e11983e54e87	COPY	`--chown=1000:1000 docker/sagemaker/serve /usr/bin/.`	08/20/2025 4:01 AM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.multi-models=true`	08/20/2025 4:01 AM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true`	08/20/2025 4:01 AM UTC
sha256:2e2046004384d9f31690c419b0f3404df2e9106d442c197e447a25ea4864cad7	RUN	`TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 pip3 install -r python/openai/requirements.txt`	08/20/2025 4:01 AM UTC
sha256:ba40061d239342dcfc1b7e7426caa7cec6483ab13297529555f0e8f84bd030a8	RUN	`TRITON_VERSION=2.60.0 TRITON_CONTAINER_VERSION=25.08 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] && find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]`	08/20/2025 4:01 AM UTC
sha256:905acadf144a9b1e61348ceb44339f1d78b39e75da4b9cc3e762c0c212714cd3	COPY	`--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .`	08/20/2025 4:01 AM UTC
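
The layer listing above is tab-separated, so it can be loaded into records for inspection. The following is a minimal Python sketch, not part of the container or any NVIDIA tooling. The names `Layer`, `parse_layers`, `EMPTY_LAYER`, and `SAMPLE` are hypothetical, and `SAMPLE` holds three rows copied from the listing. The sketch also assumes that the digest shared by the ENV and LABEL rows marks instructions that change only image metadata and add no filesystem content; the listing is consistent with that reading but does not state it.

```python
from collections import Counter
from typing import NamedTuple

# Assumption: this digest, shared by the ENV and LABEL rows in the listing,
# marks metadata-only instructions that add no filesystem content.
EMPTY_LAYER = "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"


class Layer(NamedTuple):
    digest: str
    instruction: str
    command: str
    created: str


def parse_layers(text: str) -> list[Layer]:
    """Parse tab-separated rows of the form: digest, instruction, command, created."""
    layers = []
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip the header row and anything that is not a four-column layer row.
        if len(parts) != 4 or not parts[0].startswith("sha256:"):
            continue
        digest, instruction, command, created = (p.strip() for p in parts)
        # Commands in the listing may be wrapped in backticks; drop them.
        layers.append(Layer(digest, instruction, command.strip("`"), created))
    return layers


# Three rows copied from the listing above.
SAMPLE = "\n".join([
    "sha256:6549d60b078cc0626c8ba52235522f68290a8be22e655240a583e11983e54e87"
    "\tCOPY\t`--chown=1000:1000 docker/sagemaker/serve /usr/bin/.`"
    "\t08/20/2025 4:01 AM UTC",
    EMPTY_LAYER
    + "\tLABEL\t`com.amazonaws.sagemaker.capabilities.multi-models=true`"
    "\t08/20/2025 4:01 AM UTC",
    EMPTY_LAYER
    + "\tLABEL\t`com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true`"
    "\t08/20/2025 4:01 AM UTC",
])

layers = parse_layers(SAMPLE)
counts = Counter(layer.instruction for layer in layers)
metadata_only = [layer for layer in layers if layer.digest == EMPTY_LAYER]

if __name__ == "__main__":
    for instruction, n in sorted(counts.items()):
        print(f"{instruction}: {n}")
    print(f"metadata-only rows: {len(metadata_only)}")
```

Grouping rows by instruction this way makes it easier to separate steps that add files (COPY, RUN) from steps that only set environment variables or labels.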