USD Code API Helm Chart

Welcome to the USD Code API Helm Chart!

Homepage: https://docs.omniverse.nvidia.com/services/latest/services/usd-code/overview.html

Overview

This chart assumes that a Kubernetes cluster is already available and configured.

Prerequisite

Generate your NGC Helm and container registry API Key by referring to the onboarding guide provided here.

To ensure a smooth experience, please configure your Kubernetes cluster with the following essential features enabled:

NVIDIA k8s device plugin or NVIDIA GPU Operator: This is required for running the internal LLM and embedding models. Make sure that the versions of the driver, CUDA, and Fabric Manager (if installed) are compatible with one another.

Please refer to the Kubernetes Setup Documentation or Install a Local Kubernetes with MicroK8s for installation guidance.
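One way to confirm that the device plugin or GPU Operator is advertising GPUs to the scheduler is to list the allocatable nvidia.com/gpu count per node. The sketch below is only an illustration: the function name list_node_gpus is hypothetical, and it assumes kubectl is already pointed at the target cluster.

```shell
# Hypothetical helper: list each node with the number of GPUs it advertises.
# A value of <none> suggests the device plugin or GPU Operator is not
# running (or not healthy) on that node.
list_node_gpus() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}
```

Running list_node_gpus before installing the chart gives a quick indication of whether the cluster can satisfy the GPU limits requested by the llm and embedding pods.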

Deployment

You can deploy USD Code to the Kubernetes cluster with the following steps.

Set the following environment variables, which are used to fetch the Helm chart and pull container images from NGC:

export NGC_API_KEY=nvapi-...
export IMAGE_PULL_SECRET=nvcrimagepullsecret
export HELM_RELEASE=usdcode
export HELM_CHART_REPO=https://helm.ngc.nvidia.com/nvidia/omniverse-usdcode/charts/usdcode-1.0.0.tgz
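Because every later command depends on these four variables, it can help to verify that none of them is empty before continuing. The following is a minimal sketch; the function name check_required_vars is hypothetical and not part of the chart.

```shell
# Hypothetical preflight check: report every required variable that is
# unset or empty, and return non-zero if at least one is missing.
check_required_vars() {
  missing=0
  for name in NGC_API_KEY IMAGE_PULL_SECRET HELM_RELEASE HELM_CHART_REPO; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_required_vars && echo "environment looks complete"` prints the confirmation only when all four variables are set.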

Install the Helm chart. The default GPU limit for the llm pod is 4; if L40S GPUs are used, add --set llm.resources.limits."nvidia\.com/gpu"=8:

helm install \
  --username '$oauthtoken' \
  --password "$NGC_API_KEY" \
  "$HELM_RELEASE" \
  "$HELM_CHART_REPO" \
  --set secrets.create.registry=true \
  --set ngcImagePullSecretName="$IMAGE_PULL_SECRET" \
  --set secrets.OMNIVERSE_NGC_API_KEY="$NGC_API_KEY" \
  -n "$HELM_RELEASE" \
  --create-namespace
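As an illustration of the GPU override, the same install command can be wrapped in a small function that takes the GPU limit as an argument. This is a sketch only: the function name install_usdcode is hypothetical, and the default of 4 simply mirrors the chart default stated above. Note that the dot in nvidia.com is escaped with a backslash so that Helm treats it as part of the key rather than as a path separator.

```shell
# Hypothetical wrapper around the install command shown above.
# Usage: install_usdcode [gpu_limit]   (defaults to 4)
install_usdcode() {
  gpu_limit="${1:-4}"
  # '$oauthtoken' is single-quoted on purpose: it is a literal username.
  helm install \
    --username '$oauthtoken' \
    --password "$NGC_API_KEY" \
    "$HELM_RELEASE" \
    "$HELM_CHART_REPO" \
    --set secrets.create.registry=true \
    --set ngcImagePullSecretName="$IMAGE_PULL_SECRET" \
    --set secrets.OMNIVERSE_NGC_API_KEY="$NGC_API_KEY" \
    --set llm.resources.limits."nvidia\.com/gpu"="$gpu_limit" \
    -n "$HELM_RELEASE" \
    --create-namespace
}
```

With this wrapper, `install_usdcode 8` would correspond to the L40S case, and `install_usdcode` with no argument keeps the default limit.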

Once the installation is complete, check that all pods are ready:

kubectl get pods -n $HELM_RELEASE

There will be three pods: main, llm (llama3.1-70b), and embedding (nv-embed-e5-v5). The embedding pod takes around 3 minutes to become ready, the llm pod takes around 15 minutes, and the main pod becomes ready once it confirms that both the llm and embedding pods are ready. In total, expect around 15-20 minutes.
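Instead of polling kubectl get pods by hand, kubectl wait can block until every pod in the namespace reports the Ready condition. The sketch below assumes the namespace contains only the long-running pods of this release (a completed Job pod would never become Ready and would cause the wait to time out); the function name wait_for_pods and the 30m default are illustrative choices.

```shell
# Hypothetical helper: block until all pods in the release namespace are Ready,
# or fail once the timeout expires.
# Usage: wait_for_pods [timeout]   (defaults to 30m)
wait_for_pods() {
  timeout="${1:-30m}"
  kubectl wait --for=condition=Ready pods --all \
    -n "$HELM_RELEASE" --timeout="$timeout"
}
```

A non-zero exit status from wait_for_pods indicates that at least one pod did not become ready within the timeout.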

If any pods are still in the Pending state after 30 minutes, refer to the troubleshooting section of the Kubernetes documentation. Examples of troubleshooting for Helm and Kubernetes are documented here.
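As a starting point for diagnosing Pending pods, the standard kubectl commands below list the affected pods, describe each one, and print recent namespace events. This is a generic sketch rather than a procedure from the chart documentation; the function name diagnose_pending_pods is hypothetical.

```shell
# Hypothetical helper: show why pods in the release namespace are still Pending.
diagnose_pending_pods() {
  # Names (pod/<name>) of pods that are still in the Pending phase.
  pending=$(kubectl get pods -n "$HELM_RELEASE" \
    --field-selector=status.phase=Pending -o name)
  for pod in $pending; do
    # The Events section at the end of the output usually names the blocker,
    # for example insufficient nvidia.com/gpu or an image pull failure.
    kubectl describe "$pod" -n "$HELM_RELEASE"
  done
  # Recent events for the whole namespace, oldest first.
  kubectl get events -n "$HELM_RELEASE" --sort-by=.lastTimestamp
}
```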

Forward the main service to localhost:8000 for testing purposes:

kubectl port-forward svc/main 8000:8000 -n $HELM_RELEASE

Once testing is done, you can uninstall the deployment:

helm uninstall $HELM_RELEASE -n $HELM_RELEASE

Governing Terms

If you download the software and materials as available from the NVIDIA AI product portfolio, use is governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products; except for the model which is governed by the NVIDIA AI Foundation Models Community License Agreement, and the RAG dataset which is governed by the terms of the NVIDIA Asset License.

ADDITIONAL INFORMATION: For Llama model, Llama 3.1 Community License Agreement, Built with Llama; for NV-EmbedQA-E5-v5: MIT license; for NV-EmbedQA-Mistral7B-v2: Apache 2.0 license, and Snowflake arctic-embed-l: Apache 2.0 license.

If you download the software and materials as available from the NVIDIA Omniverse portfolio, use is governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA Omniverse; except for the model which is governed by the NVIDIA AI Foundation Models Community License Agreement, and the RAG dataset which is governed by the terms of the NVIDIA Asset License.