Linux / arm64
Linux / amd64
Triton Inference Server provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. Triton supports HTTP/REST and gRPC protocols that allow remote clients to request inferencing for any model managed by the server. For edge deployments, Triton is available as a shared library with a C API that allows the full functionality of Triton to be included directly in an application.
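As a rough illustration of the HTTP/REST path, the sketch below builds a JSON inference request and checks server readiness using only the Python standard library. It assumes a Triton instance listening on localhost:8000 (the default HTTP port) and the KServe v2 style endpoints `/v2/health/ready` and `/v2/models/<name>/infer`. The model name `my_model` and the tensor name `INPUT0` are hypothetical placeholders; substitute the values from your own model configuration.

```python
import json
import urllib.request

# Assumption: Triton's HTTP endpoint is reachable at localhost:8000.
BASE_URL = "http://localhost:8000"


def build_infer_request(input_name, shape, datatype, data):
    """Build the JSON body for a single-input inference request."""
    return {
        "inputs": [
            {
                "name": input_name,      # hypothetical tensor name
                "shape": list(shape),
                "datatype": datatype,    # e.g. "FP32" or "INT32"
                "data": list(data),      # flattened tensor contents
            }
        ]
    }


def infer_url(model_name, base_url=BASE_URL):
    """Return the inference endpoint URL for a model."""
    return f"{base_url}/v2/models/{model_name}/infer"


def is_server_ready(base_url=BASE_URL, timeout=2.0):
    """Return True if the readiness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v2/health/ready", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses.
        return False


def infer(model_name, body, base_url=BASE_URL, timeout=10.0):
    """POST the request body to the model and return the decoded JSON reply."""
    request = urllib.request.Request(
        infer_url(model_name, base_url),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, a call might look like `infer("my_model", build_infer_request("INPUT0", [1, 4], "FP32", [1.0, 2.0, 3.0, 4.0]))` after `is_server_ready()` returns True. For production clients, the official Triton client libraries are likely a better fit than hand-built requests.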
Three Docker images are available:
The Triton Inference Server Production Branch, exclusively available with NVIDIA AI Enterprise, is an API-stable branch supported for nine months that includes monthly fixes for high and critical software vulnerabilities. This branch provides a stable and secure environment for building your mission-critical AI applications. A new Production Branch is released every six months, so each branch overlaps its successor by three months (nine months of support minus the six-month release interval).
Before you start, ensure that your environment is set up by following one of the deployment guides available in the NVIDIA AI Enterprise Documentation.
For an overview of the features included in the Triton Inference Server Production Branch - October 2024 (24h2), refer to the Release Notes for Triton Inference Server 24.08.
For more information about the Triton Inference Server, see:
Additionally, if you're looking for information on Docker containers and guidance on running a container, review the Containers For Deep Learning Frameworks User Guide.
For optimal performance, we highly recommend deploying the supported NVIDIA AI Enterprise Infrastructure software in conjunction with your AI software.
Production Branch - October 2024 (24h2) is compatible with NVIDIA AI Enterprise Infrastructure 5.
The OSS License Archive contains all project-related licenses. It supports transparency and compliance with legal requirements by providing detailed information about the terms and conditions associated with the use, modification, and distribution of this project.
Please review the Security Scanning tab to view the latest security scan results.
For certain open-source vulnerabilities listed in the scan results, NVIDIA provides a response in the form of a Vulnerability Exploitability eXchange (VEX) document. The VEX information can be reviewed and downloaded from the Security Scanning tab.
Get access to knowledge base articles and support cases or submit a ticket.
Visit the NVIDIA AI Enterprise Documentation Hub for release documentation, deployment guides and more.
Go to the NVIDIA Licensing Portal to manage the software licenses for your products.