The NeMo NIM Proxy microservice provides a unified access point for all NVIDIA NIM (NVIDIA Inference Microservices) deployments within your Kubernetes cluster through a single OpenAI-compatible API.
You can use the NIM Proxy microservice to interact with multiple NIM microservices through a unified proxy endpoint. With this microservice, you can list the deployed NIM microservices and make inference requests to the chat/completions and completions APIs.
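As a sketch of how a client might interact with the proxy, the following assumes an in-cluster service reachable at a hypothetical base URL (`http://nemo-nim-proxy:8000/v1`; the actual host and port depend on your Helm release values) and the model name shown is a placeholder for whatever your `/v1/models` listing returns:

```python
import json

# Hypothetical in-cluster base URL for the NIM Proxy service; the real host
# and port depend on your deployment configuration.
BASE_URL = "http://nemo-nim-proxy:8000/v1"


def list_models_url(base_url: str = BASE_URL) -> str:
    """URL for listing the deployed NIM microservices (OpenAI-style model list)."""
    return f"{base_url}/models"


def chat_completion_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Build an OpenAI-compatible chat/completions request (URL, JSON body)."""
    body = {
        "model": model,  # a deployed NIM, as returned by the /v1/models listing
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return f"{BASE_URL}/chat/completions", json.dumps(body).encode()


# Sending the request requires network access to the cluster, for example:
# import urllib.request
# url, data = chat_completion_request("meta/llama-3.1-8b-instruct", "Hello!")
# req = urllib.request.Request(
#     url, data=data, headers={"Content-Type": "application/json"}
# )
# print(urllib.request.urlopen(req).read().decode())
```

Because the proxy is OpenAI-compatible, any OpenAI client library pointed at the proxy's base URL should work the same way; only the model name differs per deployed NIM.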
You can install NeMo NIM Proxy as part of the NeMo microservices platform by using the NeMo Microservices Helm Chart (chart | documentation).
Note: Use, distribution, or deployment of this microservice in production requires an NVIDIA AI Enterprise License. The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.