Triton Inference Server

NVIDIA

Container

Triton Inference Server is open-source software that lets teams deploy trained AI models from any framework, loaded from local or cloud storage, on any GPU- or CPU-based infrastructure, whether in the cloud, in a data center, or on embedded devices.

NVIDIA AI Enterprise Supported

Layer	Instruction	Command	Created
sha256:63bb9ca632ea92209b1ed82bcc7cb43992844141e29da26155a232e797c9ec91	COPY	`/opt/hpcx/ompi /opt/hpcx/ompi`	06/18/2025 10:28 PM UTC
sha256:28e1c3846f65abe8373d6d7f13923618ddd6845a02d1a12751d3506bb3f3210e	RUN	`TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 rm -fr /opt/hpcx/ompi`	06/18/2025 10:28 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	ENV	`LD_LIBRARY_PATH=/usr/local/tensorrt/lib/:/opt/tritonserver/backends/tensorrtllm:/usr/local/tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64`	06/18/2025 10:28 PM UTC
sha256:6adfc89e853ff3e9bab4775621e5372164273dde56879378bcca8282bc81c80f	RUN	`TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 ldconfig && ARCH="$(uname -i)" && rm -fr ${TRT_ROOT}/bin ${TRT_ROOT}/targets/${ARCH}-linux-gnu/bin ${TRT_ROOT}/data && rm -fr ${TRT_ROOT}/doc ${TRT_ROOT}/onnx_graphsurgeon ${TRT_ROOT}/python && rm -fr ${TRT_ROOT}/samples ${TRT_ROOT}/targets/${ARCH}-linux-gnu/samples && pip3 install --no-cache-dir transformers && find /usr -name libtensorrt_llm.so -exec dirname {} \; > /etc/ld.so.conf.d/tensorrt-llm.conf && find /opt/tritonserver -name libtritonserver.so -exec dirname {} \; > /etc/ld.so.conf.d/triton-tensorrtllm-worker.conf && pip3 install --no-cache-dir grpcio-tools==1.64.0 && pip3 uninstall -y setuptools`	06/18/2025 10:28 PM UTC
sha256:c02a24c97f21949460addb2ca14385d596430041c60e8a97d1bc34114a1f4045	COPY	`--chown=1000:1000 docker/sagemaker/serve /usr/bin/.`	06/18/2025 10:28 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.multi-models=true`	06/18/2025 10:28 PM UTC
sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4	LABEL	`com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true`	06/18/2025 10:28 PM UTC
sha256:44aa70d7c1fcca4bef2e17f7dca267da30fc335269996323cb0e1e04f188f947	RUN	`TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 pip3 install -r python/openai/requirements.txt`	06/18/2025 10:28 PM UTC
sha256:65ea7b945d77812c4f864677fba00c33dda7964077a092631147adb898a69234	RUN	`TRITON_VERSION=2.59.0 TRITON_CONTAINER_VERSION=25.06 find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonserver-*.whl" | xargs -I {} pip install --upgrade {}[all] && find /opt/tritonserver/python -maxdepth 1 -type f -name "tritonfrontend-*.whl" | xargs -I {} pip install --upgrade {}[all]`	06/18/2025 10:28 PM UTC
sha256:8defb1f612df54dc826a6c496c78e563bf67810993055c6e44aa5fd046540436	COPY	`--chown=1000:1000 NVIDIA_Deep_Learning_Container_License.pdf .`	06/18/2025 10:28 PM UTC
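
In the table above, the ENV and LABEL rows all carry the same digest, while each COPY and RUN row has its own. One plausible reading, which the page itself does not state, is that metadata-only instructions add no filesystem content and so share a single placeholder layer. The sketch below groups a few rows by digest to make that pattern visible. The row list is a hand-copied subset of the table, and the helper names `instructions_by_digest` and `shared_digests` are hypothetical, chosen for this illustration only.

```python
from collections import defaultdict

# A subset of rows copied from the layer table, reduced to
# (digest, instruction) pairs; commands and timestamps are omitted.
ROWS = [
    ("sha256:63bb9ca632ea92209b1ed82bcc7cb43992844141e29da26155a232e797c9ec91", "COPY"),
    ("sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4", "ENV"),
    ("sha256:c02a24c97f21949460addb2ca14385d596430041c60e8a97d1bc34114a1f4045", "COPY"),
    ("sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4", "LABEL"),
    ("sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4", "LABEL"),
]


def instructions_by_digest(rows):
    """Map each layer digest to the instructions listed against it, in table order."""
    grouped = defaultdict(list)
    for digest, instruction in rows:
        grouped[digest].append(instruction)
    return dict(grouped)


def shared_digests(rows):
    """Keep only the digests that appear on more than one row."""
    return {
        digest: instructions
        for digest, instructions in instructions_by_digest(rows).items()
        if len(instructions) > 1
    }


if __name__ == "__main__":
    for digest, instructions in shared_digests(ROWS).items():
        # Print a shortened digest next to the instructions that share it.
        print(digest[:19], instructions)
```

Running the same grouping over the full table would show whether any RUN or COPY rows also share a digest; in the rows listed here, none do.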