Linux / amd64
The NeMo NIM Proxy microservice provides a unified access point for all NVIDIA NIM (NVIDIA Inference Microservice) deployments in your Kubernetes cluster through a single OpenAI-compatible API.
You can use the NIM Proxy to interact with multiple deployed models through standardized endpoints: one for retrieving the list of available models, and others for making inference requests to the chat completions and completions APIs.
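Because the proxy exposes an OpenAI-compatible API, any standard HTTP client can talk to it. The sketch below, using only the Python standard library, shows the general shape of the three endpoint calls; the base URL and the model name are placeholder assumptions for illustration, not values defined by this microservice — substitute your own deployment's service address and a model ID returned by `/v1/models`.

```python
import json
import urllib.request

# Hypothetical in-cluster address of the NIM Proxy service;
# replace with your deployment's actual host and port.
BASE_URL = "http://nemo-nim-proxy:8000"


def build_chat_body(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def list_models() -> dict:
    """GET /v1/models — list the models deployed behind the proxy."""
    with urllib.request.urlopen(f"{BASE_URL}/v1/models") as resp:
        return json.load(resp)


def chat(model: str, prompt: str) -> str:
    """POST /v1/chat/completions — send an OpenAI-format chat request."""
    body = json.dumps(build_chat_body(model, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request and response formats follow the OpenAI convention, existing OpenAI client libraries can also be pointed at the proxy by overriding their base URL.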
Note: Use, distribution, or deployment of this microservice in production requires an NVIDIA AI Enterprise license.
The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products.