This documentation describes the NV-Ingest Helm charts.
[!Note] NV-Ingest is also known as NVIDIA Ingest and NeMo Retriever extraction.
Refer to our supported hardware/software configurations here.
NAMESPACE=nv-ingest
kubectl create namespace ${NAMESPACE}
# NVIDIA nemo-microservices NGC repository
helm repo add nemo-microservices https://helm.ngc.nvidia.com/nvidia/nemo-microservices --username='$oauthtoken' --password=<NGC_API_KEY>
# NVIDIA NIM NGC repository
helm repo add nvidia-nim https://helm.ngc.nvidia.com/nim/nvidia --username='$oauthtoken' --password=<NGC_API_KEY>
# NVIDIA NIM Baidu NGC repository
helm repo add baidu-nim https://helm.ngc.nvidia.com/nim/baidu --username='$oauthtoken' --password=<NGC_API_KEY>
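To confirm the repositories were added and pull their latest chart indexes, you can run the standard Helm commands:
helm repo list
helm repo update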
helm upgrade \
--install \
nv-ingest \
https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nv-ingest-25.3.0.tgz \
-n ${NAMESPACE} \
--username '$oauthtoken' \
--password "${NGC_API_KEY}" \
--set ngcImagePullSecret.create=true \
--set ngcImagePullSecret.password="${NGC_API_KEY}" \
--set ngcApiSecret.create=true \
--set ngcApiSecret.password="${NGC_API_KEY}" \
--set image.repository="nvcr.io/nvidia/nemo-microservices/nv-ingest" \
--set image.tag="25.3.0"
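After the install completes, you can verify that the release was created and watch the pods come up; pod names and startup times vary by cluster:
helm status nv-ingest -n ${NAMESPACE}
kubectl get pods -n ${NAMESPACE}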
Optionally, you can create the Secrets yourself if you do not want the Helm chart to create them for you.
NAMESPACE=nv-ingest
DOCKER_CONFIG='{"auths":{"nvcr.io":{"username":"$oauthtoken", "password":"'${NGC_API_KEY}'" }}}'
echo -n $DOCKER_CONFIG | base64 -w0
NGC_REGISTRY_PASSWORD=$(echo -n $DOCKER_CONFIG | base64 -w0 )
kubectl apply -n ${NAMESPACE} -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
name: ngcImagePullSecret
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: ${NGC_REGISTRY_PASSWORD}
EOF
kubectl create -n ${NAMESPACE} secret generic ngc-api --from-literal=NGC_API_KEY=${NGC_API_KEY}
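Before installing the chart, you can confirm that both Secrets exist under the names the chart expects:
kubectl get secrets -n ${NAMESPACE}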
Alternatively, you can use an external secret store such as Vault. The secret name expected for the NGC API is ngc-api, and the secret name expected for NVCR is ngcImagePullSecret.
In this case, make sure to remove the following from your helm command:
--set ngcImagePullSecret.create=true \
--set ngcImagePullSecret.password="${NGC_API_KEY}" \
--set ngcApiSecret.create=true \
--set ngcApiSecret.password="${NGC_API_KEY}" \
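As a sketch, an install that relies on pre-created or externally managed Secrets is the same helm upgrade command with those four flags removed, for example:
helm upgrade \
--install \
nv-ingest \
https://helm.ngc.nvidia.com/nvidia/nemo-microservices/charts/nv-ingest-25.3.0.tgz \
-n ${NAMESPACE} \
--username '$oauthtoken' \
--password "${NGC_API_KEY}" \
--set image.repository="nvcr.io/nvidia/nemo-microservices/nv-ingest" \
--set image.tag="25.3.0"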
The PVC setup for minikube requires a bit more configuration. If you are using minikube, follow the steps below.
minikube start --driver docker --container-runtime docker --gpus all --nodes 3
minikube addons enable nvidia-gpu-device-plugin
minikube addons enable storage-provisioner-rancher
Jobs are submitted via the nv-ingest-cli command.
NV-Ingest uses an HTTP/REST-based submission method. By default, the REST service runs on port 7670.
First, build nv-ingest-cli from source to ensure you have the latest code.
For more information, refer to NV-Ingest-Client.
# Just to be cautious, remove any existing installation
pip uninstall nv-ingest-client
pip install nv-ingest-client==25.3.0
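You can verify that the client installed correctly before submitting jobs (the --help flag prints the available commands and options):
pip show nv-ingest-client
nv-ingest-cli --help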
It is recommended that the end user provide an ingress mechanism for the NV-Ingest pod.
You can test outside of your Kubernetes cluster by port-forwarding the NV-Ingest pod to your local environment.
Example:
kubectl port-forward -n ${NAMESPACE} service/nv-ingest 7670:7670
To forward traffic to a specific pod instead, first find the name of the NV-Ingest pod by running:
kubectl get pods -n <namespace> --no-headers -o custom-columns=":metadata.name"
The output will look similar to the following, but with different auto-generated suffixes.
nv-ingest-674f6b7477-65nvm
nv-ingest-etcd-0
nv-ingest-milvus-standalone-7f8ffbdfbc-jpmlj
nv-ingest-minio-7cbd4f5b9d-99hl4
nv-ingest-opentelemetry-collector-7bb59d57fc-4h59q
nv-ingest-paddle-0
nv-ingest-redis-master-0
nv-ingest-redis-replicas-0
nv-ingest-yolox-0
nv-ingest-zipkin-77b5fc459f-ptsj6
kubectl port-forward -n ${NAMESPACE} nv-ingest-674f6b7477-65nvm 7670:7670
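With the port-forward in place, a quick way to confirm the service is reachable from your workstation is to hit the health endpoint used by the chart's readiness probe (/v1/health/ready by default):
curl http://localhost:7670/v1/health/ready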
The NVIDIA GPU operator is used to make the Kubernetes cluster aware of the GPUs.
helm install --wait --generate-name --repo https://helm.ngc.nvidia.com/nvidia \
--namespace gpu-operator --create-namespace \
gpu-operator --set driver.enabled=false
Once this is installed, we can check the number of available GPUs. Your output might differ depending on the number of GPUs on your nodes.
kubectl get nodes -o json | jq -r '.items[] | select(.metadata.name | test("-worker[0-9]*$")) | {name: .metadata.name, "nvidia.com/gpu": .status.allocatable["nvidia.com/gpu"]}'
{
"name": "one-worker-one-gpu-worker",
"nvidia.com/gpu": "1"
}
With the NVIDIA GPU operator properly installed, we want to enable time-slicing to allow more than one pod to use this GPU. We do this by creating a time-slicing config file, time-slicing-config.yaml.
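For reference, a minimal time-slicing config typically looks like the following sketch; the time-slicing-config name and the any key match the cluster policy patch below, and 16 replicas matches the slicing described in this example (the exact contents of the file in your checkout may differ):
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 16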
kubectl apply -n gpu-operator -f time-slicing/time-slicing-config.yaml
Then update the cluster policy to use this new config.
kubectl patch clusterpolicies.nvidia.com/cluster-policy \
-n gpu-operator --type merge \
-p '{"spec": {"devicePlugin": {"config": {"name": "time-slicing-config", "default": "any"}}}}'
Now, if we check the number of available GPUs, we should see 16. Note that this is still just GPU 0, which we exposed to the node, sliced into 16.
kubectl get nodes -o json | jq -r '.items[] | select(.metadata.name | test("-worker[0-9]*$")) | {name: .metadata.name, "nvidia.com/gpu": .status.allocatable["nvidia.com/gpu"]}'
{
"name": "one-worker-one-gpu-worker",
"nvidia.com/gpu": "16"
}
Here is a sample invocation of a PDF extraction task using the port-forward above:
mkdir -p ./processed_docs
nv-ingest-cli \
--doc /path/to/your/unique.pdf \
--output_directory ./processed_docs \
--task='extract:{"document_type": "pdf", "extract_text": true, "extract_images": true, "extract_tables": true, "extract_charts": true}' \
--client_host=localhost \
--client_port=7670
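When the job finishes, the extracted content is written under the output directory; the exact subdirectory layout depends on the document and the tasks you ran, but a recursive listing shows what was produced:
ls -R ./processed_docs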
You can also use NV-Ingest's Python client API to interact with the service running in the cluster. Use the same host and port as in the previous nv-ingest-cli example.
Repository | Name | Version |
---|---|---|
alias:baidu-nim | paddleocr-nim(nvidia-nim-paddleocr) | 1.2.0 |
alias:nemo-microservices | nim-vlm-text-extraction(nim-vlm) | 1.2.0-ea-v2 |
alias:nvidia-nim | nvidia-nim-llama-32-nv-embedqa-1b-v2 | 1.5.0 |
alias:nvidia-nim | nemoretriever-graphic-elements-v1(nvidia-nim-nemoretriever-graphic-elements-v1) | 1.2.0 |
alias:nvidia-nim | nemoretriever-page-elements-v2(nvidia-nim-nemoretriever-page-elements-v2) | 1.2.0 |
alias:nvidia-nim | nemoretriever-table-structure-v1(nvidia-nim-nemoretriever-table-structure-v1) | 1.2.0 |
alias:nvidia-nim | text-embedding-nim(nvidia-nim-nv-embedqa-e5-v5) | 1.5.0 |
https://open-telemetry.github.io/opentelemetry-helm-charts | opentelemetry-collector | 0.78.1 |
https://zilliztech.github.io/milvus-helm | milvus | 4.1.11 |
https://zipkin.io/zipkin-helm | zipkin | 0.1.2 |
oci://registry-1.docker.io/bitnamicharts | redis | 19.1.3 |
Key | Type | Default | Description |
---|---|---|---|
affinity | object | {} |
|
autoscaling.enabled | bool | false |
|
autoscaling.maxReplicas | int | 100 |
|
autoscaling.metrics | list | [] |
|
autoscaling.minReplicas | int | 1 |
|
containerArgs | list | [] |
|
containerSecurityContext | object | {} |
|
envVars.AUDIO_GRPC_ENDPOINT | string | "audio:50051" |
|
envVars.AUDIO_INFER_PROTOCOL | string | "grpc" |
|
envVars.COMPONENTS_TO_READY_CHECK | string | "ALL" |
|
envVars.EMBEDDING_NIM_ENDPOINT | string | "http://nv-ingest-embedqa:8000/v1" |
|
envVars.EMBEDDING_NIM_MODEL_NAME | string | "nvidia/llama-3.2-nv-embedqa-1b-v2" |
|
envVars.INGEST_EDGE_BUFFER_SIZE | int | 64 |
|
envVars.MAX_INGEST_PROCESS_WORKERS | int | 16 |
|
envVars.MESSAGE_CLIENT_HOST | string | "nv-ingest-redis-master" |
|
envVars.MESSAGE_CLIENT_PORT | string | "6379" |
|
envVars.MILVUS_ENDPOINT | string | "http://nv-ingest-milvus:19530" |
|
envVars.MINIO_BUCKET | string | "nv-ingest" |
|
envVars.MINIO_INTERNAL_ADDRESS | string | "nv-ingest-minio:9000" |
|
envVars.MINIO_PUBLIC_ADDRESS | string | "http://localhost:9000" |
|
envVars.MODEL_PREDOWNLOAD_PATH | string | "/workspace/models/" |
|
envVars.MRC_IGNORE_NUMA_CHECK | int | 1 |
|
envVars.NEMORETRIEVER_PARSE_HTTP_ENDPOINT | string | "http://nim-vlm-text-extraction-nemoretriever-parse:8000/v1/chat/completions" |
|
envVars.NEMORETRIEVER_PARSE_INFER_PROTOCOL | string | "http" |
|
envVars.NV_INGEST_DEFAULT_TIMEOUT_MS | string | "1234" |
|
envVars.NV_INGEST_MAX_UTIL | int | 48 |
|
envVars.PADDLE_GRPC_ENDPOINT | string | "nv-ingest-paddle:8001" |
|
envVars.PADDLE_HTTP_ENDPOINT | string | "http://nv-ingest-paddle:8000/v1/infer" |
|
envVars.PADDLE_INFER_PROTOCOL | string | "grpc" |
|
envVars.REDIS_MORPHEUS_TASK_QUEUE | string | "morpheus_task_queue" |
|
envVars.VLM_CAPTION_ENDPOINT | string | "https://ai.api.nvidia.com/v1/gr/meta/llama-3.2-11b-vision-instruct/chat/completions" |
|
envVars.VLM_CAPTION_MODEL_NAME | string | "meta/llama-3.2-11b-vision-instruct" |
|
envVars.YOLOX_GRAPHIC_ELEMENTS_GRPC_ENDPOINT | string | "nemoretriever-graphic-elements-v1:8001" |
|
envVars.YOLOX_GRAPHIC_ELEMENTS_HTTP_ENDPOINT | string | "http://nemoretriever-graphic-elements-v1:8000/v1/infer" |
|
envVars.YOLOX_GRAPHIC_ELEMENTS_INFER_PROTOCOL | string | "grpc" |
|
envVars.YOLOX_GRPC_ENDPOINT | string | "nemoretriever-page-elements-v2:8001" |
|
envVars.YOLOX_HTTP_ENDPOINT | string | "http://nemoretriever-page-elements-v2:8000/v1/infer" |
|
envVars.YOLOX_INFER_PROTOCOL | string | "grpc" |
|
envVars.YOLOX_TABLE_STRUCTURE_GRPC_ENDPOINT | string | "nemoretriever-table-structure-v1:8001" |
|
envVars.YOLOX_TABLE_STRUCTURE_HTTP_ENDPOINT | string | "http://nemoretriever-table-structure-v1:8000/v1/infer" |
|
envVars.YOLOX_TABLE_STRUCTURE_INFER_PROTOCOL | string | "grpc" |
|
extraEnvVarsCM | string | "" |
|
extraEnvVarsSecret | string | "" |
|
extraVolumeMounts | object | {} |
|
extraVolumes | object | {} |
|
fullnameOverride | string | "" |
|
image.pullPolicy | string | "IfNotPresent" |
|
image.repository | string | "nvcr.io/nvidia/nemo-microservices/nv-ingest" |
|
image.tag | string | "25.3.0" |
|
imagePullSecrets[0].name | string | "ngc-api" |
|
imagePullSecrets[1].name | string | "ngc-secret" |
|
ingress.annotations | object | {} |
|
ingress.className | string | "" |
|
ingress.enabled | bool | false |
|
ingress.hosts[0].host | string | "chart-example.local" |
|
ingress.hosts[0].paths[0].path | string | "/" |
|
ingress.hosts[0].paths[0].pathType | string | "ImplementationSpecific" |
|
ingress.tls | list | [] |
|
livenessProbe.enabled | bool | false |
|
livenessProbe.failureThreshold | int | 20 |
|
livenessProbe.httpGet.path | string | "/v1/health/live" |
|
livenessProbe.httpGet.port | string | "http" |
|
livenessProbe.initialDelaySeconds | int | 120 |
|
livenessProbe.periodSeconds | int | 10 |
|
livenessProbe.successThreshold | int | 1 |
|
livenessProbe.timeoutSeconds | int | 20 |
|
llama-32-nv-rerankqa-1b-v2.autoscaling.enabled | bool | false |
|
llama-32-nv-rerankqa-1b-v2.autoscaling.maxReplicas | int | 10 |
|
llama-32-nv-rerankqa-1b-v2.autoscaling.metrics | list | [] |
|
llama-32-nv-rerankqa-1b-v2.autoscaling.minReplicas | int | 1 |
|
llama-32-nv-rerankqa-1b-v2.customArgs | list | [] |
|
llama-32-nv-rerankqa-1b-v2.customCommand | list | [] |
|
llama-32-nv-rerankqa-1b-v2.deployed | bool | false |
|
llama-32-nv-rerankqa-1b-v2.env[0].name | string | "NIM_HTTP_API_PORT" |
|
llama-32-nv-rerankqa-1b-v2.env[0].value | string | "8000" |
|
llama-32-nv-rerankqa-1b-v2.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
llama-32-nv-rerankqa-1b-v2.env[1].value | string | "1" |
|
llama-32-nv-rerankqa-1b-v2.image.repository | string | "nvcr.io/nim/nvidia/llama-3.2-nv-rerankqa-1b-v2" |
|
llama-32-nv-rerankqa-1b-v2.image.tag | string | "1.3.1" |
|
llama-32-nv-rerankqa-1b-v2.nim.grpcPort | int | 8001 |
|
llama-32-nv-rerankqa-1b-v2.nim.logLevel | string | "INFO" |
|
llama-32-nv-rerankqa-1b-v2.podSecurityContext.fsGroup | int | 1000 |
|
llama-32-nv-rerankqa-1b-v2.podSecurityContext.runAsGroup | int | 1000 |
|
llama-32-nv-rerankqa-1b-v2.podSecurityContext.runAsUser | int | 1000 |
|
llama-32-nv-rerankqa-1b-v2.replicaCount | int | 1 |
|
llama-32-nv-rerankqa-1b-v2.service.grpcPort | int | 8001 |
|
llama-32-nv-rerankqa-1b-v2.service.httpPort | int | 8000 |
|
llama-32-nv-rerankqa-1b-v2.service.metricsPort | int | 0 |
|
llama-32-nv-rerankqa-1b-v2.service.name | string | "llama-32-nv-rerankqa-1b-v2" |
|
llama-32-nv-rerankqa-1b-v2.service.type | string | "ClusterIP" |
|
llama-32-nv-rerankqa-1b-v2.serviceAccount.create | bool | false |
|
llama-32-nv-rerankqa-1b-v2.serviceAccount.name | string | "" |
|
llama-32-nv-rerankqa-1b-v2.statefuleSet.enabled | bool | false |
|
logLevel | string | "DEFAULT" |
|
milvus.cluster.enabled | bool | false |
|
milvus.etcd.persistence.storageClass | string | nil |
|
milvus.etcd.replicaCount | int | 1 |
|
milvus.image.all.repository | string | "milvusdb/milvus" |
|
milvus.image.all.tag | string | "v2.5.3-gpu" |
|
milvus.minio.bucketName | string | "a-bucket" |
|
milvus.minio.mode | string | "standalone" |
|
milvus.minio.persistence.size | string | "50Gi" |
|
milvus.minio.persistence.storageClass | string | nil |
|
milvus.pulsar.enabled | bool | false |
|
milvus.standalone.extraEnv[0].name | string | "LOG_LEVEL" |
|
milvus.standalone.extraEnv[0].value | string | "error" |
|
milvus.standalone.persistence.persistentVolumeClaim.size | string | "50Gi" |
|
milvus.standalone.persistence.persistentVolumeClaim.storageClass | string | nil |
|
milvus.standalone.resources.limits."nvidia.com/gpu" | int | 1 |
|
milvusDeployed | bool | true |
|
nameOverride | string | "" |
|
nemo.groupID | string | "1000" |
|
nemo.userID | string | "1000" |
|
nemoretriever-graphic-elements-v1.autoscaling.enabled | bool | false |
|
nemoretriever-graphic-elements-v1.autoscaling.maxReplicas | int | 10 |
|
nemoretriever-graphic-elements-v1.autoscaling.metrics | list | [] |
|
nemoretriever-graphic-elements-v1.autoscaling.minReplicas | int | 1 |
|
nemoretriever-graphic-elements-v1.customArgs | list | [] |
|
nemoretriever-graphic-elements-v1.customCommand | list | [] |
|
nemoretriever-graphic-elements-v1.deployed | bool | true |
|
nemoretriever-graphic-elements-v1.env[0].name | string | "NIM_HTTP_API_PORT" |
|
nemoretriever-graphic-elements-v1.env[0].value | string | "8000" |
|
nemoretriever-graphic-elements-v1.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
nemoretriever-graphic-elements-v1.env[1].value | string | "1" |
|
nemoretriever-graphic-elements-v1.image.repository | string | "nvcr.io/nim/nvidia/nemoretriever-graphic-elements-v1" |
|
nemoretriever-graphic-elements-v1.image.tag | string | "1.2.0" |
|
nemoretriever-graphic-elements-v1.nim.grpcPort | int | 8001 |
|
nemoretriever-graphic-elements-v1.nim.logLevel | string | "INFO" |
|
nemoretriever-graphic-elements-v1.podSecurityContext.fsGroup | int | 1000 |
|
nemoretriever-graphic-elements-v1.podSecurityContext.runAsGroup | int | 1000 |
|
nemoretriever-graphic-elements-v1.podSecurityContext.runAsUser | int | 1000 |
|
nemoretriever-graphic-elements-v1.replicaCount | int | 1 |
|
nemoretriever-graphic-elements-v1.service.grpcPort | int | 8001 |
|
nemoretriever-graphic-elements-v1.service.httpPort | int | 8000 |
|
nemoretriever-graphic-elements-v1.service.metricsPort | int | 0 |
|
nemoretriever-graphic-elements-v1.service.name | string | "nemoretriever-graphic-elements-v1" |
|
nemoretriever-graphic-elements-v1.service.type | string | "ClusterIP" |
|
nemoretriever-graphic-elements-v1.serviceAccount.create | bool | false |
|
nemoretriever-graphic-elements-v1.serviceAccount.name | string | "" |
|
nemoretriever-graphic-elements-v1.statefuleSet.enabled | bool | false |
|
nemoretriever-page-elements-v2.autoscaling.enabled | bool | false |
|
nemoretriever-page-elements-v2.autoscaling.maxReplicas | int | 10 |
|
nemoretriever-page-elements-v2.autoscaling.metrics | list | [] |
|
nemoretriever-page-elements-v2.autoscaling.minReplicas | int | 1 |
|
nemoretriever-page-elements-v2.deployed | bool | true |
|
nemoretriever-page-elements-v2.env[0].name | string | "NIM_HTTP_API_PORT" |
|
nemoretriever-page-elements-v2.env[0].value | string | "8000" |
|
nemoretriever-page-elements-v2.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
nemoretriever-page-elements-v2.env[1].value | string | "1" |
|
nemoretriever-page-elements-v2.image.pullPolicy | string | "IfNotPresent" |
|
nemoretriever-page-elements-v2.image.repository | string | "nvcr.io/nim/nvidia/nemoretriever-page-elements-v2" |
|
nemoretriever-page-elements-v2.image.tag | string | "1.2.0" |
|
nemoretriever-page-elements-v2.nim.grpcPort | int | 8001 |
|
nemoretriever-page-elements-v2.nim.logLevel | string | "INFO" |
|
nemoretriever-page-elements-v2.podSecurityContext.fsGroup | int | 1000 |
|
nemoretriever-page-elements-v2.podSecurityContext.runAsGroup | int | 1000 |
|
nemoretriever-page-elements-v2.podSecurityContext.runAsUser | int | 1000 |
|
nemoretriever-page-elements-v2.replicaCount | int | 1 |
|
nemoretriever-page-elements-v2.service.grpcPort | int | 8001 |
|
nemoretriever-page-elements-v2.service.httpPort | int | 8000 |
|
nemoretriever-page-elements-v2.service.metricsPort | int | 0 |
|
nemoretriever-page-elements-v2.service.name | string | "nemoretriever-page-elements-v2" |
|
nemoretriever-page-elements-v2.service.type | string | "ClusterIP" |
|
nemoretriever-page-elements-v2.serviceAccount.create | bool | false |
|
nemoretriever-page-elements-v2.serviceAccount.name | string | "" |
|
nemoretriever-page-elements-v2.statefuleSet.enabled | bool | false |
|
nemoretriever-table-structure-v1.autoscaling.enabled | bool | false |
|
nemoretriever-table-structure-v1.autoscaling.maxReplicas | int | 10 |
|
nemoretriever-table-structure-v1.autoscaling.metrics | list | [] |
|
nemoretriever-table-structure-v1.autoscaling.minReplicas | int | 1 |
|
nemoretriever-table-structure-v1.customArgs | list | [] |
|
nemoretriever-table-structure-v1.customCommand | list | [] |
|
nemoretriever-table-structure-v1.deployed | bool | true |
|
nemoretriever-table-structure-v1.env[0].name | string | "NIM_HTTP_API_PORT" |
|
nemoretriever-table-structure-v1.env[0].value | string | "8000" |
|
nemoretriever-table-structure-v1.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
nemoretriever-table-structure-v1.env[1].value | string | "1" |
|
nemoretriever-table-structure-v1.image.repository | string | "nvcr.io/nim/nvidia/nemoretriever-table-structure-v1" |
|
nemoretriever-table-structure-v1.image.tag | string | "1.2.0" |
|
nemoretriever-table-structure-v1.nim.grpcPort | int | 8001 |
|
nemoretriever-table-structure-v1.nim.logLevel | string | "INFO" |
|
nemoretriever-table-structure-v1.podSecurityContext.fsGroup | int | 1000 |
|
nemoretriever-table-structure-v1.podSecurityContext.runAsGroup | int | 1000 |
|
nemoretriever-table-structure-v1.podSecurityContext.runAsUser | int | 1000 |
|
nemoretriever-table-structure-v1.replicaCount | int | 1 |
|
nemoretriever-table-structure-v1.service.grpcPort | int | 8001 |
|
nemoretriever-table-structure-v1.service.httpPort | int | 8000 |
|
nemoretriever-table-structure-v1.service.metricsPort | int | 0 |
|
nemoretriever-table-structure-v1.service.name | string | "nemoretriever-table-structure-v1" |
|
nemoretriever-table-structure-v1.service.type | string | "ClusterIP" |
|
nemoretriever-table-structure-v1.serviceAccount.create | bool | false |
|
nemoretriever-table-structure-v1.serviceAccount.name | string | "" |
|
nemoretriever-table-structure-v1.statefuleSet.enabled | bool | false |
|
ngcApiSecret.create | bool | false |
|
ngcApiSecret.password | string | "" |
|
ngcImagePullSecret.create | bool | false |
|
ngcImagePullSecret.name | string | "ngcImagePullSecret" |
|
ngcImagePullSecret.password | string | "" |
|
ngcImagePullSecret.registry | string | "nvcr.io" |
|
ngcImagePullSecret.username | string | "$oauthtoken" |
|
nim-vlm-image-captioning.customArgs | list | [] |
|
nim-vlm-image-captioning.customCommand | list | [] |
|
nim-vlm-image-captioning.deployed | bool | false |
|
nim-vlm-image-captioning.env[0].name | string | "NIM_HTTP_API_PORT" |
|
nim-vlm-image-captioning.env[0].value | string | "8000" |
|
nim-vlm-image-captioning.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
nim-vlm-image-captioning.env[1].value | string | "1" |
|
nim-vlm-image-captioning.fullnameOverride | string | "nim-vlm-image-captioning" |
|
nim-vlm-image-captioning.image.repository | string | "nvcr.io/nim/meta/llama-3.2-11b-vision-instruct" |
|
nim-vlm-image-captioning.image.tag | string | "latest" |
|
nim-vlm-image-captioning.nim.grpcPort | int | 8001 |
|
nim-vlm-image-captioning.nim.logLevel | string | "INFO" |
|
nim-vlm-image-captioning.podSecurityContext.fsGroup | int | 1000 |
|
nim-vlm-image-captioning.podSecurityContext.runAsGroup | int | 1000 |
|
nim-vlm-image-captioning.podSecurityContext.runAsUser | int | 1000 |
|
nim-vlm-image-captioning.replicaCount | int | 1 |
|
nim-vlm-image-captioning.service.grpcPort | int | 8001 |
|
nim-vlm-image-captioning.service.httpPort | int | 8000 |
|
nim-vlm-image-captioning.service.metricsPort | int | 0 |
|
nim-vlm-image-captioning.service.name | string | "nim-vlm-image-captioning" |
|
nim-vlm-image-captioning.service.type | string | "ClusterIP" |
|
nim-vlm-text-extraction.customArgs | list | [] |
|
nim-vlm-text-extraction.customCommand | list | [] |
|
nim-vlm-text-extraction.deployed | bool | false |
|
nim-vlm-text-extraction.env[0].name | string | "NIM_HTTP_API_PORT" |
|
nim-vlm-text-extraction.env[0].value | string | "8000" |
|
nim-vlm-text-extraction.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
nim-vlm-text-extraction.env[1].value | string | "1" |
|
nim-vlm-text-extraction.fullnameOverride | string | "nim-vlm-text-extraction-nemoretriever-parse" |
|
nim-vlm-text-extraction.image.repository | string | "nvcr.io/nvidia/nemo-microservices/nemoretriever-parse" |
|
nim-vlm-text-extraction.image.tag | string | "1.2.0ea" |
|
nim-vlm-text-extraction.nim.grpcPort | int | 8001 |
|
nim-vlm-text-extraction.nim.logLevel | string | "INFO" |
|
nim-vlm-text-extraction.podSecurityContext.fsGroup | int | 1000 |
|
nim-vlm-text-extraction.podSecurityContext.runAsGroup | int | 1000 |
|
nim-vlm-text-extraction.podSecurityContext.runAsUser | int | 1000 |
|
nim-vlm-text-extraction.replicaCount | int | 1 |
|
nim-vlm-text-extraction.service.grpcPort | int | 8001 |
|
nim-vlm-text-extraction.service.httpPort | int | 8000 |
|
nim-vlm-text-extraction.service.metricsPort | int | 0 |
|
nim-vlm-text-extraction.service.name | string | "nim-vlm-text-extraction-nemoretriever-parse" |
|
nim-vlm-text-extraction.service.type | string | "ClusterIP" |
|
nodeSelector | object | {} |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.autoscaling.enabled | bool | false |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.autoscaling.maxReplicas | int | 10 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.autoscaling.metrics | list | [] |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.autoscaling.minReplicas | int | 1 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.customArgs | list | [] |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.customCommand | list | [] |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.deployed | bool | true |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.env[0].name | string | "NIM_HTTP_API_PORT" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.env[0].value | string | "8000" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.env[1].name | string | "NIM_TRITON_MAX_BATCH_SIZE" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.env[1].value | string | "1" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.fullnameOverride | string | "nv-ingest-embedqa" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.image.repository | string | "nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.image.tag | string | "1.5.0" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.nim.grpcPort | int | 8001 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.nim.logLevel | string | "INFO" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.podSecurityContext.fsGroup | int | 1000 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.podSecurityContext.runAsGroup | int | 1000 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.podSecurityContext.runAsUser | int | 1000 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.replicaCount | int | 1 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.service.grpcPort | int | 8001 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.service.httpPort | int | 8000 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.service.metricsPort | int | 0 |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.service.name | string | "nv-ingest-embedqa" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.service.type | string | "ClusterIP" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.serviceAccount.create | bool | false |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.serviceAccount.name | string | "" |
|
nvidia-nim-llama-32-nv-embedqa-1b-v2.statefuleSet.enabled | bool | false |
|
opentelemetry-collector.config.exporters.debug.verbosity | string | "detailed" |
|
opentelemetry-collector.config.exporters.zipkin.endpoint | string | "http://nv-ingest-zipkin:9411/api/v2/spans" |
|
opentelemetry-collector.config.extensions.health_check | object | {} |
|
opentelemetry-collector.config.extensions.zpages.endpoint | string | "0.0.0.0:55679" |
|
opentelemetry-collector.config.processors.batch | object | {} |
|
opentelemetry-collector.config.processors.tail_sampling.policies[0].name | string | "drop_noisy_traces_url" |
|
opentelemetry-collector.config.processors.tail_sampling.policies[0].string_attribute.enabled_regex_matching | bool | true |
|
opentelemetry-collector.config.processors.tail_sampling.policies[0].string_attribute.invert_match | bool | true |
|
opentelemetry-collector.config.processors.tail_sampling.policies[0].string_attribute.key | string | "http.target" |
|
opentelemetry-collector.config.processors.tail_sampling.policies[0].string_attribute.values[0] | string | "\\/health" |
|
opentelemetry-collector.config.processors.tail_sampling.policies[0].type | string | "string_attribute" |
|
opentelemetry-collector.config.processors.transform.trace_statements[0].context | string | "span" |
|
opentelemetry-collector.config.processors.transform.trace_statements[0].statements[0] | string | "set(status.code, 1) where attributes[\"http.path\"] == \"/health\"" |
|
opentelemetry-collector.config.processors.transform.trace_statements[0].statements[1] | string | "replace_match(attributes[\"http.route\"], \"/v1\", attributes[\"http.target\"]) where attributes[\"http.target\"] != nil" |
|
opentelemetry-collector.config.processors.transform.trace_statements[0].statements[2] | string | "replace_pattern(name, \"/v1\", attributes[\"http.route\"]) where attributes[\"http.route\"] != nil" |
|
opentelemetry-collector.config.processors.transform.trace_statements[0].statements[3] | string | "set(name, Concat([name, attributes[\"http.url\"]], \" \")) where name == \"POST\"" |
|
opentelemetry-collector.config.receivers.otlp.protocols.grpc.endpoint | string | "${env:MY_POD_IP}:4317" |
|
opentelemetry-collector.config.receivers.otlp.protocols.http.cors.allowed_origins[0] | string | "*" |
|
opentelemetry-collector.config.service.extensions[0] | string | "zpages" |
|
opentelemetry-collector.config.service.extensions[1] | string | "health_check" |
|
opentelemetry-collector.config.service.pipelines.logs.exporters[0] | string | "debug" |
|
opentelemetry-collector.config.service.pipelines.logs.processors[0] | string | "batch" |
|
opentelemetry-collector.config.service.pipelines.logs.receivers[0] | string | "otlp" |
|
opentelemetry-collector.config.service.pipelines.metrics.exporters[0] | string | "debug" |
|
opentelemetry-collector.config.service.pipelines.metrics.processors[0] | string | "batch" |
|
opentelemetry-collector.config.service.pipelines.metrics.receivers[0] | string | "otlp" |
|
opentelemetry-collector.config.service.pipelines.traces.exporters[0] | string | "debug" |
|
opentelemetry-collector.config.service.pipelines.traces.exporters[1] | string | "zipkin" |
|
opentelemetry-collector.config.service.pipelines.traces.processors[0] | string | "tail_sampling" |
|
opentelemetry-collector.config.service.pipelines.traces.processors[1] | string | "transform" |
|
opentelemetry-collector.config.service.pipelines.traces.receivers[0] | string | "otlp" |
|
opentelemetry-collector.mode | string | "deployment" |
|
otelDeployed | bool | true |
|
otelEnabled | bool | true |
|
otelEnvVars.OTEL_LOGS_EXPORTER | string | "none" |
|
otelEnvVars.OTEL_METRICS_EXPORTER | string | "otlp" |
|
otelEnvVars.OTEL_PROPAGATORS | string | "tracecontext,baggage" |
|
otelEnvVars.OTEL_PYTHON_EXCLUDED_URLS | string | "health" |
|
otelEnvVars.OTEL_RESOURCE_ATTRIBUTES | string | "deployment.environment=$(NAMESPACE)" |
|
otelEnvVars.OTEL_SERVICE_NAME | string | "nemo-retrieval-service" |
|
otelEnvVars.OTEL_TRACES_EXPORTER | string | "otlp" |
|
paddleocr-nim."paddleocr-nim.deployed" | bool | true |
|
paddleocr-nim.autoscaling.enabled | bool | false |
|
paddleocr-nim.autoscaling.maxReplicas | int | 10 |
|
paddleocr-nim.autoscaling.metrics | list | [] |
|
paddleocr-nim.autoscaling.minReplicas | int | 1 |
|
paddleocr-nim.customArgs | list | [] |
|
paddleocr-nim.customCommand | list | [] |
|
paddleocr-nim.env[0].name | string | "NIM_HTTP_API_PORT" |
|
paddleocr-nim.env[0].value | string | "8000" |
|
paddleocr-nim.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
paddleocr-nim.env[1].value | string | "1" |
|
paddleocr-nim.fullnameOverride | string | "nv-ingest-paddle" |
|
paddleocr-nim.image.repository | string | "nvcr.io/nim/baidu/paddleocr" |
|
paddleocr-nim.image.tag | string | "1.2.0" |
|
paddleocr-nim.nim.grpcPort | int | 8001 |
|
paddleocr-nim.nim.logLevel | string | "INFO" |
|
paddleocr-nim.podSecurityContext.fsGroup | int | 1000 |
|
paddleocr-nim.podSecurityContext.runAsGroup | int | 1000 |
|
paddleocr-nim.podSecurityContext.runAsUser | int | 1000 |
|
paddleocr-nim.replicaCount | int | 1 |
|
paddleocr-nim.service.grpcPort | int | 8001 |
|
paddleocr-nim.service.httpPort | int | 8000 |
|
paddleocr-nim.service.metricsPort | int | 0 |
|
paddleocr-nim.service.name | string | "nv-ingest-paddle" |
|
paddleocr-nim.service.type | string | "ClusterIP" |
|
paddleocr-nim.serviceAccount.create | bool | false |
|
paddleocr-nim.serviceAccount.name | string | "" |
|
paddleocr-nim.statefuleSet.enabled | bool | false |
|
podAnnotations."traffic.sidecar.istio.io/excludeOutboundPorts" | string | "8007" |
|
podLabels | object | {} |
|
podSecurityContext.fsGroup | int | 1000 |
|
readinessProbe.enabled | bool | false |
|
readinessProbe.failureThreshold | int | 220 |
|
readinessProbe.httpGet.path | string | "/v1/health/ready" |
|
readinessProbe.httpGet.port | string | "http" |
|
readinessProbe.initialDelaySeconds | int | 120 |
|
readinessProbe.periodSeconds | int | 30 |
|
readinessProbe.successThreshold | int | 1 |
|
readinessProbe.timeoutSeconds | int | 10 |
|
redis.auth.enabled | bool | false |
|
redis.master.configmap | string | "protected-mode no" |
|
redis.master.persistence.size | string | "50Gi" |
|
redis.master.resources.limits.memory | string | "12Gi" |
|
redis.master.resources.requests.memory | string | "6Gi" |
|
redis.replica.persistence.size | string | "50Gi" |
|
redis.replica.replicaCount | int | 1 |
|
redis.replica.resources.limits.memory | string | "12Gi" |
|
redis.replica.resources.requests.memory | string | "6Gi" |
|
redisDeployed | bool | true |
|
replicaCount | int | 1 |
|
resources.limits.cpu | string | "48000m" |
|
resources.limits.memory | string | "200Gi" |
|
resources.requests.cpu | string | "24000m" |
|
resources.requests.memory | string | "24Gi" |
|
service.annotations | object | {} |
|
service.labels | object | {} |
|
service.name | string | "" |
|
service.nodePort | string | nil |
|
service.port | int | 7670 |
|
service.type | string | "ClusterIP" |
|
serviceAccount.annotations | object | {} |
|
serviceAccount.automount | bool | true |
|
serviceAccount.create | bool | true |
|
serviceAccount.name | string | "" |
|
text-embedding-nim.autoscaling.enabled | bool | false |
|
text-embedding-nim.autoscaling.maxReplicas | int | 10 |
|
text-embedding-nim.autoscaling.metrics | list | [] |
|
text-embedding-nim.autoscaling.minReplicas | int | 1 |
|
text-embedding-nim.customArgs | list | [] |
|
text-embedding-nim.customCommand | list | [] |
|
text-embedding-nim.deployed | bool | false |
|
text-embedding-nim.env[0].name | string | "NIM_HTTP_API_PORT" |
|
text-embedding-nim.env[0].value | string | "8000" |
|
text-embedding-nim.env[1].name | string | "NIM_TRITON_MODEL_BATCH_SIZE" |
|
text-embedding-nim.env[1].value | string | "1" |
|
text-embedding-nim.fullnameOverride | string | "nv-ingest-embedqa" |
|
text-embedding-nim.image.repository | string | "nvcr.io/nim/nvidia/nv-embedqa-e5-v5" |
|
text-embedding-nim.image.tag | string | "1.5.0" |
|
text-embedding-nim.nim.grpcPort | int | 8001 |
|
text-embedding-nim.nim.logLevel | string | "INFO" |
|
text-embedding-nim.podSecurityContext.fsGroup | int | 1000 |
|
text-embedding-nim.podSecurityContext.runAsGroup | int | 1000 |
|
text-embedding-nim.podSecurityContext.runAsUser | int | 1000 |
|
text-embedding-nim.replicaCount | int | 1 |
|
text-embedding-nim.service.grpcPort | int | 8001 |
|
text-embedding-nim.service.httpPort | int | 8000 |
|
text-embedding-nim.service.metricsPort | int | 0 |
|
text-embedding-nim.service.name | string | "nv-ingest-embedqa" |
|
text-embedding-nim.service.type | string | "ClusterIP" |
|
text-embedding-nim.serviceAccount.create | bool | false |
|
text-embedding-nim.serviceAccount.name | string | "" |
|
text-embedding-nim.statefuleSet.enabled | bool | false |
|
tmpDirSize | string | "50Gi" |
|
tolerations | list | [] |
|
zipkinDeployed | bool | true |