What Is Triton Management Service (TMS)?
Triton Management Service (TMS) is a Kubernetes microservice that manages the deployment of AI models on Triton Inference Server (TIS) instances. The benefit of using TMS over manual or custom deployment solutions comes from TMS's in-depth understanding of TIS and GPU hardware, and of how they interact with model frameworks such as PyTorch, TensorFlow, ONNX, and others. TMS strives to deploy the minimum number of TIS instances while preserving the performance of the TIS-served AI models.
Getting Started with the TMS gRPC API Bundle
The TMS gRPC API bundle is exclusively available with NVIDIA AI Enterprise.
Before you start, ensure that your environment is set up by following one of the deployment guides available in the NVIDIA AI Enterprise Documentation.
This gRPC IDL bundle provides the Protocol Buffer (.proto) definition files that make up the TMS API. Use the provided files to generate a client library for your preferred development language and integrate TMS directly into your applications.
Read the TMS gRPC API package documentation for additional details and instructions on creating custom client software for TMS.
Compatible Infrastructure Software Versions
For optimal performance, deploy the supported NVIDIA AI Enterprise Infrastructure software with Triton Management Service (TMS).
The latest version of TMS is compatible with:
NVIDIA AI Enterprise Documentation
Visit the NVIDIA AI Enterprise Documentation Hub for release documentation, deployment guides, and more.