Deploy Omniverse Farm on Kubernetes
Repository | Name | Version |
---|---|---|
https://charts.bitnami.com/bitnami | mysql | 8.9.6 |
https://charts.bitnami.com/bitnami | redis | 16.9.1 |
This chart assumes that a Kubernetes cluster is already available and configured.
Depending on the ingress controller and namespace some changes may be required.
In order to run GPU workloads, the cluster must have the NVIDIA Device Plugin installed and have access to nvidia.com/gpu resources.
Release notes for each version can be found in this section.
Release 1.0.0 brings various important updates and bug fixes, including a database migration affecting the Tasks service.
By default, all migrations run during an upgrade; to disable them, use the global.skip_db_migrations=true setting.
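As a sketch of how the flag fits into an upgrade, the snippet below builds the upgrade command with migrations disabled and, by default, only prints it. The DRY_RUN variable and the function name are illustrative assumptions, not part of the chart; the release name, chart file and namespace mirror the ones used elsewhere in this guide.

```shell
CHART="omniverse-farm-1.0.0.tgz"
NAMESPACE="ov-farm"

upgrade_without_migrations() {
  # Assemble the command as positional parameters so it can be printed or run.
  set -- helm upgrade --install omniverse-farm "$CHART" \
    --namespace "$NAMESPACE" \
    --set global.skip_db_migrations=true
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"      # dry run (default): only show the command
  else
    "$@"           # DRY_RUN=0: actually run helm
  fi
}

upgrade_without_migrations
```

Setting DRY_RUN=0 in the environment would execute the command instead of printing it.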
Several settings have been added or updated:
settings.service.url_prefix: default changed from /queue/management/settings to /queue.
global.skip_db_migrations: new setting, defaulting to false, for skipping database migrations during install.
controller.serviceConfig.public_to_private_source_queue_hostnames: new setting for providing a mapping from public-facing DNS to internal DNS for the Farm Tasks service.
logs.serviceConfig.logging_store_class: new setting, defaulting to the existing class omni.services.farm.logging.store.memory.MemoryLogStore.
logs.serviceConfig.logging_store_args: new setting, defaulting to {}.
Fetch the helm chart using the Fetch Command provided above. By default this command shows the latest version; if other versions are available, they are listed in the File Browser tab.
helm fetch \
https://helm.ngc.nvidia.com/nvidia/omniverse/charts/omniverse-farm-1.0.0.tgz \
--username='$oauthtoken' \
--password=YOUR_API_KEY
Before installing it's worth considering image pull secrets and dashboard access.
helm upgrade \
--install \
omniverse-farm \
omniverse-farm-1.0.0.tgz \
--create-namespace \
--namespace ov-farm \
--set global.imagePullSecrets[0].name=my-registry-secret
Farm container images are hosted on nvcr.io behind a private repository. Therefore, you will need to create the image pull secret with the same NGC Token used to pull the chart.
NOTE: if the farm namespace does not exist, create it prior to creating the secret.
kubectl create namespace ov-farm
kubectl create secret docker-registry \
my-registry-secret \
--namespace ov-farm \
--docker-server="nvcr.io" \
--docker-username='$oauthtoken' \
--docker-password=YOUR_API_KEY
Then during install, specify the image pull secret globally.
--set global.imagePullSecrets[0].name=my-registry-secret
The default ingress class used in this chart is nginx, but this can be changed by supplying --set global.ingress.annotations."kubernetes\.io/ingress\.class"="INGRESS-CONTROLLER"
The following two examples make use of nginx ingress, but other ingress providers can be utilized such as traefik.
It's worth getting familiar with the set of options available under the global.ingress setting.
This example uses a simple local DNS setup.
For production setups you'll need to configure tls cert secrets for the hosts (this is beyond the scope of this guide).
To use a custom local DNS name, for example farm.ov.local, we'll need to update /etc/hosts and map it to the Node's Internal IP.
The resulting value to specify during helm install will be --set global.ingress.host="farm.ov.local"
This example also uses nginx ingress:
helm install ingress-nginx \
ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.service.type=NodePort \
--set controller.service.nodePorts.http=32080
helm upgrade \
--install \
omniverse-farm \
omniverse-farm-1.0.0.tgz \
--create-namespace \
--namespace ov-farm \
--set global.ingress.annotations."kubernetes\.io/ingress\.class"="nginx" \
--set global.ingress.host="farm.ov.local"
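The same ingress settings can be kept in a values file instead of repeated --set flags. The following is a minimal sketch; the file name ingress-values.yaml is an arbitrary choice, and the keys mirror the global.ingress.annotations and global.ingress.host settings used above.

```shell
# Write the ingress overrides to a values file (file name is an assumption).
cat > ingress-values.yaml <<'EOF'
global:
  ingress:
    annotations:
      kubernetes.io/ingress.class: nginx
    host: farm.ov.local
EOF

# Then pass the file at install time (not executed here):
#   helm upgrade --install omniverse-farm omniverse-farm-1.0.0.tgz \
#     --create-namespace --namespace ov-farm -f ingress-values.yaml
```

A values file avoids the shell escaping needed for the dots in the annotation key.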
kubectl get nodes -o=wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ov-farm-local-control-plane Ready control-plane,master 3h22m v1.23.4 172.25.0.2 <none> Ubuntu 21.10 5.15.0-27-generic containerd://1.5.10
Add an entry to your /etc/hosts:
sudo vi /etc/hosts
172.25.0.2 farm.ov.local
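To avoid adding the same entry twice when these steps are repeated, the edit can be scripted. This is a sketch only: add_hosts_entry is a hypothetical helper name, and it takes the hosts file path as an argument so it can be tried against a scratch file before touching /etc/hosts (which requires root).

```shell
add_hosts_entry() {
  # $1: IP address, $2: host name, $3: hosts file to update
  ip="$1"; host="$2"; file="$3"
  if grep -qwF "$host" "$file" 2>/dev/null; then
    echo "entry for $host already present in $file"
  else
    printf '%s %s\n' "$ip" "$host" >> "$file"
  fi
}

# Try it against a scratch file first.
scratch="$(mktemp)"
add_hosts_entry 172.25.0.2 farm.ov.local "$scratch"
add_hosts_entry 172.25.0.2 farm.ov.local "$scratch"   # second call adds nothing
cat "$scratch"
```

The IP and host name are the ones from the example above; substitute the Internal IP reported by kubectl get nodes for your own cluster.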
Get the port your NGINX is exposing in the NodePort service (32080 in this case)
kubectl get svc ingress-nginx-controller -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller NodePort 10.96.68.95 <none> 80:32080/TCP,443:31312/TCP 47m
A quick check confirms we can reach the Tasks service:
curl http://farm.ov.local:32080/queue/management/tasks/status
"OK"
The Farm Dashboard is reachable at: http://farm.ov.local:32080/queue/management/dashboard
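Right after an install the services may take a moment to come up, so a single curl can fail. The sketch below polls the Tasks status endpoint shown above until it answers "OK". The helper names farm_url and wait_for_tasks, the retry count and the sleep interval are all assumptions made for illustration.

```shell
farm_url() {
  # $1: host, $2: node port, $3: path behind the ingress
  printf 'http://%s:%s%s\n' "$1" "$2" "$3"
}

wait_for_tasks() {
  # $1: host, $2: node port, $3: number of attempts (default 30)
  url="$(farm_url "$1" "$2" /queue/management/tasks/status)"
  attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    # -f: fail on HTTP errors, -sS: quiet but still report errors
    body="$(curl -fsS --max-time 5 "$url" 2>/dev/null || true)"
    if [ "$body" = '"OK"' ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  return 1
}

# Usage (requires a reachable cluster):
#   wait_for_tasks farm.ov.local 32080 && echo "Tasks service is up"
```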
When the ingress host is not specified, the ingress controller will default to using the controller service's IP which is then routed via the Node's IP.
Here's a setup using nginx ingress without a hostname.
helm install ingress-nginx \
ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx \
--create-namespace \
--set controller.service.type=NodePort \
--set controller.service.nodePorts.http=32080
helm upgrade \
--install \
omniverse-farm \
omniverse-farm-1.0.0.tgz \
--create-namespace \
--namespace ov-farm \
--set global.ingress.annotations."kubernetes\.io/ingress\.class"="nginx" \
--set global.ingress.host=null
kubectl get svc ingress-nginx-controller -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller NodePort 10.43.230.160 <none> 80:32080/TCP,443:32681/TCP
kubectl -n ov-farm get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
omniverse-farm <none> * 10.43.230.160 80 10m
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k3d-ov-local-server-0 Ready control-plane,master 21h v1.23.6+k3s1 172.18.0.3 <none> K3s dev 5.15.0-43-generic containerd://1.5.5
From above, port 80 is mapped to 32080 (and 443 to 32681) on the ingress-nginx-controller, and the node's internal IP is 172.18.0.3
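Reading the mapping off the PORT(S) column can also be scripted. The following sketch parses a PORT(S) string of the form shown above; node_port_for is a hypothetical helper, not a kubectl feature.

```shell
node_port_for() {
  # $1: PORT(S) column, e.g. "80:32080/TCP,443:32681/TCP"
  # $2: service port to look up
  printf '%s\n' "$1" | tr ',' '\n' | sed -n "s|^$2:\([0-9]*\)/.*|\1|p"
}

ports="80:32080/TCP,443:32681/TCP"
echo "http  -> $(node_port_for "$ports" 80)"
echo "https -> $(node_port_for "$ports" 443)"
```

The resulting port, combined with the node's internal IP, gives the address used in the checks that follow.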
A quick check confirms we can reach the Tasks service:
curl http://172.18.0.3:32080/queue/management/tasks/status
"OK"
The Farm Dashboard is reachable at: http://172.18.0.3:32080/queue/management/dashboard
Fetch the helm chart using the Fetch Command provided above. By default this command shows the latest version; if other versions are available, they are listed in the File Browser tab.
helm fetch \
https://helm.ngc.nvidia.com/nvidia/omniverse/charts/omniverse-farm-1.0.0.tgz \
--username='$oauthtoken' \
--password=YOUR_API_KEY
Then install:
helm upgrade \
--install \
omniverse-farm \
omniverse-farm-1.0.0.tgz \
--create-namespace \
--namespace ov-farm
Find the revision you want to rollback to:
helm history omniverse-farm --namespace ov-farm
Then rollback to that revision:
helm rollback omniverse-farm --namespace ov-farm <<REVISION>>
It is sometimes useful to retrieve all of the container images and copy them into a private registry, as may be required in airgapped environments.
To begin the process, retrieve the helm chart and inspect the values (you can also refer to the values table below for the container image tags).
To view the repositories and tags belonging to the Farm container images, run the helm show values command on the chart.
There should be 3 unique container images: farm-queue, farm-agent-k8s and busybox (for the init containers).
Additionally, there are the redis and mysql container images to copy as well (which are not shown in the output of helm show values).
IMPORTANT: Farm Job definitions refer to container images as well. Those container images will need to be copied to the private registry for airgapped environments in order to run tasks from them. For example, the create-render job definition refers to the nvcr.io/nvidia/omniverse/create-render:2022.2.1 container image; the steps for copying that image are the same as described further below. In addition, the Job definition will need to be updated to reflect the new repository.
NOTE: the example below is for version 0.3.2
Please ensure you do not copy the tags below as they may be different in other versions of the chart.
helm show values omniverse-farm-0.3.2.tgz | grep -E 'repository:|tag:'
repository: "nvcr.io/nvidia/omniverse/farm-queue"
tag: "104.0.1"
repository: "busybox"
tag: "1.35"
repository: "nvcr.io/nvidia/omniverse/farm-agent-k8s"
tag: "104.0.3"
repository: "busybox"
tag: "1.35"
...truncated...
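The repository and tag lines can be paired up into full image references with a small filter. This is a sketch under the assumption that each repository line is followed by its matching tag line, as in the output above; list_images is a hypothetical helper name.

```shell
list_images() {
  # Reads "repository:" / "tag:" lines on stdin and prints unique repo:tag pairs.
  awk '
    /repository:/ { gsub(/"/, "", $2); repo = $2 }
    /tag:/        { gsub(/"/, "", $2); if (repo != "") print repo ":" $2; repo = "" }
  ' | sort -u
}

# Sample input copied from the 0.3.2 output above.
images="$(list_images <<'EOF'
repository: "nvcr.io/nvidia/omniverse/farm-queue"
tag: "104.0.1"
repository: "busybox"
tag: "1.35"
repository: "nvcr.io/nvidia/omniverse/farm-agent-k8s"
tag: "104.0.3"
repository: "busybox"
tag: "1.35"
EOF
)"
printf '%s\n' "$images"

# Against a real chart (not executed here):
#   helm show values omniverse-farm-0.3.2.tgz | list_images
```

The redis and mysql images would not appear in this list, since they come from the subcharts.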
Copy the images from nvcr.io to the private registry.
It will be simpler to track images if the original tag version is preserved.
The registry login password for nvcr.io is the NGC token used to pull the helm chart.
docker login -u '$oauthtoken' nvcr.io
Copy farm-queue image:
docker pull nvcr.io/nvidia/omniverse/farm-queue:104.0.1
docker tag nvcr.io/nvidia/omniverse/farm-queue:104.0.1 myregistry.com/example-repo/farm-queue:104.0.1
docker push myregistry.com/example-repo/farm-queue:104.0.1
Copy farm-agent-k8s image:
docker pull nvcr.io/nvidia/omniverse/farm-agent-k8s:104.0.3
docker tag nvcr.io/nvidia/omniverse/farm-agent-k8s:104.0.3 myregistry.com/example-repo/farm-agent-k8s:104.0.3
docker push myregistry.com/example-repo/farm-agent-k8s:104.0.3
Copy init container image:
docker pull busybox:1.35
docker tag busybox:1.35 myregistry.com/example-repo/busybox:1.35
docker push myregistry.com/example-repo/busybox:1.35
Copy the redis image. To find the redis container image, refer to the values at https://github.com/bitnami/charts/blob/main/bitnami/redis/values.yaml for the version of redis used in this chart (see the Requirements section above):
docker pull docker.io/bitnami/redis:6.2.7-debian-10-r0
docker tag docker.io/bitnami/redis:6.2.7-debian-10-r0 myregistry.com/example-repo/redis:6.2.7-debian-10-r0
docker push myregistry.com/example-repo/redis:6.2.7-debian-10-r0
Copy the mysql image. To find the mysql container image, refer to the values at https://github.com/bitnami/charts/blob/main/bitnami/mysql/values.yaml for the version of mysql used in this chart (see the Requirements section above):
docker pull docker.io/bitnami/mysql:8.0.29-debian-10-r2
docker tag docker.io/bitnami/mysql:8.0.29-debian-10-r2 myregistry.com/example-repo/mysql:8.0.29-debian-10-r2
docker push myregistry.com/example-repo/mysql:8.0.29-debian-10-r2
Copy the create-render image. To find the container image, refer to the create-render job definition:
docker pull nvcr.io/nvidia/omniverse/create-render:2022.2.1
docker tag nvcr.io/nvidia/omniverse/create-render:2022.2.1 myregistry.com/example-repo/create-render:2022.2.1
docker push myregistry.com/example-repo/create-render:2022.2.1
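The pull, tag and push steps above follow the same pattern for every image, so they can be wrapped in a loop. The sketch below only prints the docker commands by default; TARGET_REPO, DRY_RUN and the helper names are assumptions made for illustration, and the original tag is preserved as recommended earlier.

```shell
TARGET_REPO="myregistry.com/example-repo"

target_ref() {
  # Keep the image name and tag, drop the source registry and path.
  printf '%s/%s\n' "$TARGET_REPO" "${1##*/}"
}

mirror_image() {
  src="$1"
  dst="$(target_ref "$src")"
  for cmd in "docker pull $src" "docker tag $src $dst" "docker push $dst"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$cmd"            # dry run (default): only show the command
    else
      $cmd || return 1       # DRY_RUN=0: run it, stop on the first failure
    fi
  done
}

# Dry run over two of the images listed above.
for image in nvcr.io/nvidia/omniverse/farm-queue:104.0.1 busybox:1.35; do
  mirror_image "$image"
done
```

Running with DRY_RUN=0 requires a prior docker login to both nvcr.io and the private registry.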
IMPORTANT: The create-render job definition will need to point to the private registry. Update the container in the job definition and re-upload it to Farm. For more details on uploading Job definitions, refer to this guide.
[job.create-render]
job_type = "kit-service"
name = "create-render"
... truncated ..
container = "myregistry.com/example-repo/create-render:2022.2.1"
... truncated ..
Now that the container images have been copied to the private registry, specify the new repository/registry for the container images when installing the chart.
(my-farm-values.yaml)
# NOTE: if the private registry is behind auth, specify the global.imagePullSecrets and controller.serviceConfig.k8s.jobTemplateSpecOverrides.imagePullSecrets
# global:
#   imagePullSecrets:
#     - name: my-private-registry-imagesecret
# assuming that images will reuse same tag version in the airgapped repo, hence only needing to specify the repository.
agents:
  image:
    repository: myregistry.com/example-repo/farm-queue
  initImage:
    repository: myregistry.com/example-repo/busybox
controller:
  image:
    repository: myregistry.com/example-repo/farm-agent-k8s
  initImage:
    repository: myregistry.com/example-repo/busybox
  # serviceConfig:
  #   k8s:
  #     jobTemplateSpecOverrides:
  #       imagePullSecrets:
  #         - name: my-private-registry-imagesecret
dashboard:
  image:
    repository: myregistry.com/example-repo/farm-queue
jobs:
  image:
    repository: myregistry.com/example-repo/farm-queue
  initImage:
    repository: myregistry.com/example-repo/busybox
logs:
  image:
    repository: myregistry.com/example-repo/farm-queue
metrics:
  image:
    repository: myregistry.com/example-repo/farm-queue
retries:
  image:
    repository: myregistry.com/example-repo/farm-queue
  initImage:
    repository: myregistry.com/example-repo/busybox
settings:
  image:
    repository: myregistry.com/example-repo/farm-queue
tasks:
  image:
    repository: myregistry.com/example-repo/farm-queue
  initImage:
    repository: myregistry.com/example-repo/busybox
mysql:
  image:
    registry: myregistry.com
    repository: example-repo/mysql
redis:
  image:
    registry: myregistry.com
    repository: example-repo/redis
Then install using the custom values file containing the repo/registry overrides:
helm upgrade \
--install \
omniverse-farm \
omniverse-farm-1.0.0.tgz \
--create-namespace \
--namespace ov-farm \
-f /path/to/my-farm-values.yaml
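Before installing in an airgapped cluster, it can help to confirm that no rendered manifest still points at a public registry. The sketch below filters rendered manifests for image references that do not start with the private registry; foreign_images is a hypothetical helper, and the sample input is made up for illustration. Images used by jobs that the Controller launches at runtime come from the Job definitions and would not show up here.

```shell
foreign_images() {
  # $1: private registry host; reads rendered manifests on stdin and
  # prints image references that do not start with "<registry>/".
  awk -v reg="$1" '
    $1 == "image:" || ($1 == "-" && $2 == "image:") {
      ref = $NF
      gsub(/"/, "", ref)
      if (index(ref, reg "/") != 1) print ref
    }
  ' | sort -u
}

# Made-up sample input; one of the two images is still public.
printf '%s\n' \
  '        image: "myregistry.com/example-repo/farm-queue:104.0.2"' \
  '        image: busybox:1.35' \
  | foreign_images myregistry.com

# Against the real chart (not executed here):
#   helm template omniverse-farm omniverse-farm-1.0.0.tgz \
#     -f /path/to/my-farm-values.yaml | foreign_images myregistry.com
```

An empty result suggests every rendered image reference has been redirected to the private registry.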
The following section contains an outline of all values.
To view values via helm use helm show values
NOTE: Jobs API Key
The Jobs API key is needed for creating job definitions within farm.
To retrieve the API key, get the Jobs configmap and search for "api_key"
kubectl get cm omniverse-farm-jobs -o yaml -n ov-farm | grep api_key
NOTE: There are global values.
Key | Type | Default | Description |
---|---|---|---|
agents.affinity | object | {} | Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
agents.fullnameOverride | string | "" | Full override .fullname template |
agents.image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
agents.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" | Image repository. |
agents.image.tag | string | "104.0.2" | Image tag. |
agents.imagePullSecrets | list | [] | Image Pull Secrets. |
agents.ingress.enabled | bool | true | Enables the creation of Ingress resource. |
agents.initImage.repository | string | "busybox" | Init image repository. Must have netcat installed. |
agents.initImage.tag | string | "1.35" | Init image tag. |
agents.logLevel | string | "Info" | Log level for the application (valid levels; Info, Debug, Verbose, Warning, Error) |
agents.monitoring.enabled | bool | false | Enables the creation of ServiceMonitor resource. |
agents.monitoring.prometheusNamespace | string | "monitoring" | Prometheus namespace. |
agents.name | string | "agents" | |
agents.nameOverride | string | "" | Partially override .fullname template (maintains the release name) |
agents.nodeSelector | object | {} | Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
agents.podAnnotations | object | {} | Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
agents.podSecurityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
agents.replicaCount | int | 1 | Number of replicas. |
agents.resources | object | {} | Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
agents.securityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
agents.service.containerPort | int | 80 | Container port. |
agents.service.name | string | "agents" | Name of the service. |
agents.service.port | int | 80 | Service port. |
agents.service.type | string | "ClusterIP" | Kubernetes service type. |
agents.service.url_prefix | string | "/queue/management/agents" | Url prefix for the service |
agents.serviceConfig | object | {} | Configuration specific to this service. |
agents.tolerations | list | [] | Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
controller.affinity | object | {} | Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
controller.fullnameOverride | string | "" | Full override .fullname template |
controller.image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
controller.image.repository | string | "nvcr.io/nvidia/omniverse/farm-agent-k8s" | Image repository. |
controller.image.tag | string | "104.0.4" | Image tag. |
controller.imagePullSecrets | list | [] | Image Pull Secrets |
controller.initImage.repository | string | "busybox" | Init image repository. Must have netcat installed. |
controller.initImage.tag | string | "1.35" | Init image tag. |
controller.logLevel | string | "Info" | Log level for the application (valid levels; Info, Debug, Verbose, Warning, Error) |
controller.monitoring.enabled | bool | false | Enables the creation of ServiceMonitor resource. |
controller.monitoring.prometheusNamespace | string | "monitoring" | Prometheus namespace. |
controller.name | string | "controller" | |
controller.nameOverride | string | "" | Partially override .fullname template (maintains the release name) |
controller.nodeSelector | object | {} | Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
controller.podAnnotations | object | {} | Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
controller.podSecurityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
controller.replicaCount | int | 1 | Number of replicas. |
controller.resources | object | {} | Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
controller.role.name | string | "controller-scheduler" | |
controller.roleBinding.name | string | "controller-scheduler" | |
controller.securityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
controller.service.containerPort | int | 80 | Container port. |
controller.service.name | string | "controller" | Name of the service. |
controller.service.port | int | 80 | Service port. |
controller.service.type | string | "ClusterIP" | Kubernetes service type. |
controller.serviceAccount.name | string | "controller-scheduler" | |
controller.serviceConfig | object | {"capacity":{"max_capacity":32},"k8s":{"containerSpecOverrides":{},"jobTTLSecondsAfterFinished":600,"jobTemplateSpecOverrides":{}},"public_to_private_source_queue_hostnames":{}} | Configuration specific to this service. |
controller.serviceConfig.capacity.max_capacity | int | 32 | Specify the max number of jobs the controller is allowed to run. |
controller.serviceConfig.k8s.containerSpecOverrides | object | {} | Specify Container spec overrides, these are fields under (spec.template.spec.containers) |
controller.serviceConfig.k8s.jobTTLSecondsAfterFinished | int | 600 | Specify the Jobs' TTL Seconds After Finished https://kubernetes.io/docs/concepts/workloads/controllers/ttlafterfinished/#ttl-after-finished-controller |
controller.serviceConfig.k8s.jobTemplateSpecOverrides | object | {} | Specify Job template spec overrides, these are fields under (spec.template.spec) |
controller.serviceConfig.public_to_private_source_queue_hostnames | object | {} | Specify mapping from public facing DNS to internal DNS for tasks service. |
controller.services_base_url | string | "/queue/management" | |
controller.tolerations | list | [] | Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
dashboard.affinity | object | {} | Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
dashboard.fullnameOverride | string | "" | Full override .fullname template |
dashboard.image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
dashboard.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" | Image repository. |
dashboard.image.tag | string | "104.0.2" | Image tag. |
dashboard.imagePullSecrets | list | [] | Image Pull Secrets |
dashboard.ingress.enabled | bool | true | Enables the creation of Ingress resource. |
dashboard.logLevel | string | "Info" | Log level for the application (valid levels; Info, Debug, Verbose, Warning, Error) |
dashboard.monitoring.enabled | bool | false | Enables the creation of ServiceMonitor resource. |
dashboard.monitoring.prometheusNamespace | string | "monitoring" | Prometheus namespace. |
dashboard.name | string | "dashboard" | |
dashboard.nameOverride | string | "" | Partially override .fullname template (maintains the release name) |
dashboard.nodeSelector | object | {} | Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
dashboard.podAnnotations | object | {} | Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
dashboard.podSecurityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
dashboard.replicaCount | int | 1 | Number of replicas. |
dashboard.resources | object | {} | Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
dashboard.securityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
dashboard.service.containerPort | int | 80 | Container port. |
dashboard.service.name | string | "dashboard" | Name of the service. |
dashboard.service.port | int | 80 | Service port. |
dashboard.service.type | string | "ClusterIP" | Kubernetes service type. |
dashboard.service.url_prefix | string | "/queue/management/dashboard/" | Url prefix for the service |
dashboard.serviceConfig | object | {} | Configuration specific to this service. |
dashboard.tolerations | list | [] | Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
global.imagePullSecrets | list | [] | Global image pull secrets used within the services. NOTE: Jobs created by the Controller do not use the global image pull secrets. To specify secrets for those pods, use the "jobTemplateSpecOverrides.imagePullSecrets" setting. |
global.ingress.annotations | object | {"kubernetes.io/ingress.class":"nginx"} | Global Ingress annotations. |
global.ingress.host | string | "" | Global Ingress host. |
global.ingress.paths | list | [] | Global Ingress paths. |
global.ingress.tls | list | [] | Global Ingress tls. |
global.skip_db_migrations | bool | false | Specify whether to skip database migrations. By default migrations will run during install. |
global.transportHost | string | "0.0.0.0" | Specify the services transport host. For IPv6 use "::". |
jobs.affinity | object | {} | Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
jobs.fullnameOverride | string | "" | Full override .fullname template |
jobs.image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
jobs.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" | Image repository. |
jobs.image.tag | string | "104.0.2" | Image tag. |
jobs.imagePullSecrets | list | [] | Image Pull Secrets |
jobs.ingress.enabled | bool | true | Enables the creation of Ingress resource. |
jobs.initImage.repository | string | "busybox" | Init image repository. Must have netcat installed. |
jobs.initImage.tag | string | "1.35" | Init image tag. |
jobs.logLevel | string | "Info" | Log level for the application (valid levels; Info, Debug, Verbose, Warning, Error) |
jobs.monitoring.enabled | bool | false | Enables the creation of ServiceMonitor resource. |
jobs.monitoring.prometheusNamespace | string | "monitoring" | Prometheus namespace. |
jobs.name | string | "jobs" | |
jobs.nameOverride | string | "" | Partially override .fullname template (maintains the release name) |
jobs.nodeSelector | object | {} | Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
jobs.podAnnotations | object | {} | Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
jobs.podSecurityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
jobs.replicaCount | int | 1 | Number of replicas. |
jobs.resources | object | {} | Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
jobs.securityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
jobs.service.containerPort | int | 80 | Container port. |
jobs.service.name | string | "jobs" | Name of the service. |
jobs.service.port | int | 80 | Service port. |
jobs.service.type | string | "ClusterIP" | Kubernetes service type. |
jobs.service.url_prefix | string | "/queue/management/jobs" | Url prefix for the service |
jobs.serviceConfig | object | {"apiKey":null} | Configuration specific to this service. |
jobs.serviceConfig.apiKey | string | nil | Job's API key, if unspecified one will be automatically generated. |
jobs.tolerations | list | [] | Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
logs.affinity | object | {} | Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
logs.fullnameOverride | string | "" | Full override .fullname template |
logs.image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
logs.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" | Image repository. |
logs.image.tag | string | "104.0.2" | Image tag. |
logs.imagePullSecrets | list | [] | Image Pull Secrets |
logs.ingress.enabled | bool | true | Enables the creation of Ingress resource. |
logs.logLevel | string | "Info" | Log level for the application (valid levels; Info, Debug, Verbose, Warning, Error) |
logs.monitoring.enabled | bool | false | Enables the creation of ServiceMonitor resource. |
logs.monitoring.prometheusNamespace | string | "monitoring" | Prometheus namespace. |
logs.name | string | "logs" | |
logs.nameOverride | string | "" | Partially override .fullname template (maintains the release name) |
logs.nodeSelector | object | {} | Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
logs.podAnnotations | object | {} | Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
logs.podSecurityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
logs.replicaCount | int | 1 | Number of replicas. |
logs.resources | object | {} | Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
logs.securityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
logs.service.containerPort | int | 80 | Container port. |
logs.service.name | string | "logs" | Name of the service. |
logs.service.port | int | 80 | Service port. |
logs.service.type | string | "ClusterIP" | Kubernetes service type. |
logs.service.url_prefix | string | "/queue/management/logs" | Url prefix for the service |
logs.serviceConfig | object | {"logging_store_args":{},"logging_store_class":"omni.services.farm.logging.store.memory.MemoryLogStore"} | Configuration specific to this service. |
logs.tolerations | list | [] | Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
metrics.affinity | object | {} | Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
metrics.fullnameOverride | string | "" | Full override .fullname template |
metrics.image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
metrics.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" | Image repository. |
metrics.image.tag | string | "104.0.2" | Image tag. |
metrics.imagePullSecrets | list | [] | Image Pull Secrets |
metrics.ingress.enabled | bool | true | Enables the creation of Ingress resource. |
metrics.logLevel | string | "Info" | Log level for the application (valid levels; Info, Debug, Verbose, Warning, Error) |
metrics.monitoring.enabled | bool | false | Enables the creation of ServiceMonitor resource. |
metrics.monitoring.prometheusNamespace | string | "monitoring" | Prometheus namespace. |
metrics.name | string | "metrics" | |
metrics.nameOverride | string | "" | Partially override .fullname template (maintains the release name) |
metrics.nodeSelector | object | {} | Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
metrics.podAnnotations | object | {} | Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
metrics.podSecurityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
metrics.prometheus_url_prefix | string | "/queue/utilities/metrics/prometheus" | |
metrics.replicaCount | int | 1 | Number of replicas. |
metrics.resources | object | {} | Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
metrics.securityContext | object | {} | Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
metrics.service.containerPort | int | 80 | Container port. |
metrics.service.name | string | "metrics" | Name of the service. |
metrics.service.port | int | 80 | Service port. |
metrics.service.type | string | "ClusterIP" | Kubernetes service type. |
metrics.service.url_prefix | string | "/queue/management/metrics" | Url prefix for the service |
metrics.serviceConfig | object | {} | Configuration specific to this service. |
metrics.tolerations | list | [] | Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
mysql.auth.database | string | "ovFarmTaskStore" | |
mysql.auth.password | string | "ovfarm" | |
mysql.auth.rootPassword | string | "ovfarm" | |
mysql.auth.username | string | "ovfarm" | |
mysql.enabled | bool | true | |
mysql.fullnameOverride | string | "mysql" | |
mysql.primary.configuration | string | "[mysqld]\ndefault_authentication_plugin=mysql_native_password\nskip-name-resolve\nexplicit_defaults_for_timestamp\nbasedir=/opt/bitnami/mysql\nplugin_dir=/opt/bitnami/mysql/lib/plugin\nport=3306\nsocket=/opt/bitnami/mysql/tmp/mysql.sock\ndatadir=/bitnami/mysql/data\ntmpdir=/opt/bitnami/mysql/tmp\nmax_allowed_packet=16M\nbind-address=*\npid-file=/opt/bitnami/mysql/tmp/mysqld.pid\nlog-error=/opt/bitnami/mysql/logs/mysqld.log\ncharacter-set-server=UTF8\ncollation-server=utf8_general_ci\nslow_query_log=0\nslow_query_log_file=/opt/bitnami/mysql/logs/mysqld.log\nlong_query_time=10.0\n[client]\nport=3306\nsocket=/opt/bitnami/mysql/tmp/mysql.sock\ndefault-character-set=UTF8\nplugin_dir=/opt/bitnami/mysql/lib/plugin\n[manager]\nport=3306\nsocket=/opt/bitnami/mysql/tmp/mysql.sock\npid-file=/opt/bitnami/mysql/tmp/mysqld.pid" | |
mysql.secondary.configuration | string | "[mysqld]\ndefault_authentication_plugin=mysql_native_password\nskip-name-resolve\nexplicit_defaults_for_timestamp\nbasedir=/opt/bitnami/mysql\nplugin_dir=/opt/bitnami/mysql/lib/plugin\nport=3306\nsocket=/opt/bitnami/mysql/tmp/mysql.sock\ndatadir=/bitnami/mysql/data\ntmpdir=/opt/bitnami/mysql/tmp\nmax_allowed_packet=16M\nbind-address=*\npid-file=/opt/bitnami/mysql/tmp/mysqld.pid\nlog-error=/opt/bitnami/mysql/logs/mysqld.log\ncharacter-set-server=UTF8\ncollation-server=utf8_general_ci\nslow_query_log=0\nslow_query_log_file=/opt/bitnami/mysql/logs/mysqld.log\nlong_query_time=10.0\n[client]\nport=3306\nsocket=/opt/bitnami/mysql/tmp/mysql.sock\ndefault-character-set=UTF8\nplugin_dir=/opt/bitnami/mysql/lib/plugin\n[manager]\nport=3306\nsocket=/opt/bitnami/mysql/tmp/mysql.sock\npid-file=/opt/bitnami/mysql/tmp/mysqld.pid" | |
mysql.service.port | int | 3306 | |
redis.architecture | string | "standalone" | |
redis.auth.enabled | bool | false | |
redis.auth.sentinel | bool | false | |
redis.enabled | bool | true | |
redis.fullnameOverride | string | "redis" | |
redis.service.ports.redis | int | 6379 | |
retries.affinity | object | {} |
Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
retries.fullnameOverride | string | "" |
Fully overrides the .fullname template. |
retries.image.pullPolicy | string | "IfNotPresent" |
Image pull policy. |
retries.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" |
Image repository. |
retries.image.tag | string | "104.0.2" |
Image tag. |
retries.imagePullSecrets | list | [] |
Image pull secrets. |
retries.ingress.enabled | bool | true |
Enables the creation of Ingress resource. |
retries.initImage.repository | string | "busybox" |
Init image repository. Must have netcat installed. |
retries.initImage.tag | string | "1.35" |
Init image tag. |
retries.logLevel | string | "Info" |
Log level for the application (valid levels: Info, Debug, Verbose, Warning, Error). |
retries.monitoring.enabled | bool | false |
Enables the creation of ServiceMonitor resource. |
retries.monitoring.prometheusNamespace | string | "monitoring" |
Prometheus namespace. |
retries.name | string | "retries" |
|
retries.nameOverride | string | "" |
Partially overrides the .fullname template (keeps the release name). |
retries.nodeSelector | object | {} |
Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
retries.podAnnotations | object | {} |
Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
retries.podSecurityContext | object | {} |
Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
retries.replicaCount | int | 1 |
Number of replicas. |
retries.resources | object | {} |
Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
retries.securityContext | object | {} |
Container security context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container |
retries.service.containerPort | int | 80 |
Container port. |
retries.service.name | string | "retries" |
Name of the service. |
retries.service.port | int | 80 |
Service port. |
retries.service.type | string | "ClusterIP" |
Kubernetes service type. |
retries.service.url_prefix | string | "/queue/management/retries" |
URL prefix for the service. |
retries.serviceConfig | object | {} |
Configuration specific to this service. |
retries.tolerations | list | [] |
Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
settings.affinity | object | {} |
Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
settings.fullnameOverride | string | "" |
Fully overrides the .fullname template. |
settings.image.pullPolicy | string | "IfNotPresent" |
Image pull policy. |
settings.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" |
Image repository. |
settings.image.tag | string | "104.0.2" |
Image tag. |
settings.imagePullSecrets | list | [] |
Image pull secrets. |
settings.ingress.enabled | bool | true |
Enables the creation of Ingress resource. |
settings.logLevel | string | "Info" |
Log level for the application (valid levels: Info, Debug, Verbose, Warning, Error). |
settings.monitoring.enabled | bool | false |
Enables the creation of ServiceMonitor resource. |
settings.monitoring.prometheusNamespace | string | "monitoring" |
Prometheus namespace. |
settings.name | string | "settings" |
|
settings.nameOverride | string | "" |
Partially overrides the .fullname template (keeps the release name). |
settings.nodeSelector | object | {} |
Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
settings.podAnnotations | object | {} |
Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
settings.podSecurityContext | object | {} |
Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
settings.replicaCount | int | 1 |
Number of replicas. |
settings.resources | object | {} |
Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
settings.securityContext | object | {} |
Container security context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container |
settings.service.containerPort | int | 80 |
Container port. |
settings.service.name | string | "settings" |
Name of the service. |
settings.service.port | int | 80 |
Service port. |
settings.service.type | string | "ClusterIP" |
Kubernetes service type. |
settings.service.url_prefix | string | "/queue" |
URL prefix for the service. |
settings.serviceConfig | object | {"exposed_settings":{"advanced_rendering_features":{}}} |
Configuration specific to this service. |
settings.serviceConfig.exposed_settings.advanced_rendering_features | object | {} |
Advanced rendering features to expose for the farm. |
settings.tolerations | list | [] |
Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
tasks.affinity | object | {} |
Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
tasks.fullnameOverride | string | "" |
Fully overrides the .fullname template. |
tasks.image.pullPolicy | string | "IfNotPresent" |
Image pull policy. |
tasks.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" |
Image repository. |
tasks.image.tag | string | "104.0.2" |
Image tag. |
tasks.imagePullSecrets | list | [] |
Image pull secrets. |
tasks.ingress.enabled | bool | true |
Enables the creation of Ingress resource. |
tasks.initImage.repository | string | "busybox" |
Init image repository. Must have netcat installed. |
tasks.initImage.tag | string | "1.35" |
Init image tag. |
tasks.logLevel | string | "Info" |
Log level for the application (valid levels: Info, Debug, Verbose, Warning, Error). |
tasks.monitoring.enabled | bool | false |
Enables the creation of ServiceMonitor resource. |
tasks.monitoring.prometheusNamespace | string | "monitoring" |
Prometheus namespace. |
tasks.name | string | "tasks" |
|
tasks.nameOverride | string | "" |
Partially overrides the .fullname template (keeps the release name). |
tasks.nodeSelector | object | {} |
Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
tasks.podAnnotations | object | {} |
Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
tasks.podSecurityContext | object | {} |
Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
tasks.replicaCount | int | 1 |
Number of replicas. |
tasks.resources | object | {} |
Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
tasks.securityContext | object | {} |
Container security context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container |
tasks.service.containerPort | int | 80 |
Container port. |
tasks.service.name | string | "tasks" |
Name of the service. |
tasks.service.port | int | 80 |
Service port. |
tasks.service.type | string | "ClusterIP" |
Kubernetes service type. |
tasks.service.url_prefix | string | "/queue/management/tasks" |
URL prefix for the service. |
tasks.serviceConfig | object | {} |
Configuration specific to this service. |
tasks.tolerations | list | [] |
Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
ui.affinity | object | {} |
Affinity for pod assignment. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity |
ui.enabled | bool | true |
|
ui.fullnameOverride | string | "" |
Fully overrides the .fullname template. |
ui.image.pullPolicy | string | "IfNotPresent" |
Image pull policy. |
ui.image.repository | string | "nvcr.io/nvidia/omniverse/farm-queue" |
Image repository. |
ui.image.tag | string | "104.0.2" |
Image tag. |
ui.imagePullSecrets | list | [] |
Image pull secrets. |
ui.ingress.enabled | bool | true |
Enables the creation of Ingress resource. |
ui.logLevel | string | "Info" |
Log level for the application (valid levels: Info, Debug, Verbose, Warning, Error). |
ui.monitoring.enabled | bool | false |
Enables the creation of ServiceMonitor resource. |
ui.monitoring.prometheusNamespace | string | "monitoring" |
Prometheus namespace. |
ui.name | string | "ui" |
|
ui.nameOverride | string | "" |
Partially overrides the .fullname template (keeps the release name). |
ui.nodeSelector | object | {} |
Node labels for pod assignment. https://kubernetes.io/docs/user-guide/node-selection/ |
ui.podAnnotations | object | {} |
Pod annotations. https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ |
ui.podSecurityContext | object | {} |
Security Context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod |
ui.replicaCount | int | 1 |
Number of replicas. |
ui.resources | object | {} |
Container resource requests and limits. https://kubernetes.io/docs/user-guide/compute-resources/ |
ui.securityContext | object | {} |
Container security context. https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-container |
ui.service.containerPort | int | 80 |
Container port. |
ui.service.name | string | "ui" |
Name of the service. |
ui.service.port | int | 80 |
Service port. |
ui.service.type | string | "ClusterIP" |
Kubernetes service type. |
ui.service.url_prefix | string | "/queue/management/ui" |
URL prefix for the service. |
ui.serviceConfig | object | {} |
Configuration specific to this service. |
ui.tolerations | list | [] |
Tolerations for pod assignment. https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ |
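The values in the table above can be overridden from a values file rather than with individual --set flags. The following is a minimal sketch, not a recommended configuration: the file name values-override.yaml is hypothetical, and every value (passwords, replica counts, resource sizes, node labels) is an illustrative assumption. Only the keys are taken from the table.

```yaml
# values-override.yaml (hypothetical file name)
# Keys come from the values table; all values are illustrative assumptions.

mysql:
  auth:
    # The chart defaults are "ovfarm" for both passwords.
    # The strings below are placeholders, not real credentials.
    # Assumption: this is best set on first install, since an already
    # initialized MySQL data volume may keep its original passwords.
    password: "CHANGE_ME"
    rootPassword: "CHANGE_ME_TOO"

tasks:
  # One of the documented levels: Info, Debug, Verbose, Warning, Error.
  logLevel: "Debug"
  # Requests and limits are example sizes, not measured requirements.
  resources:
    requests:
      cpu: "250m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "1Gi"
  # Example node label; replace with labels that exist in your cluster.
  nodeSelector:
    kubernetes.io/os: linux

ui:
  monitoring:
    # Creates a ServiceMonitor resource, which assumes the Prometheus
    # Operator CRDs are already installed in the cluster.
    enabled: true
    prometheusNamespace: "monitoring"
```

A file like this would typically be passed with Helm's --values (-f) flag on the helm upgrade --install command shown earlier. Keys that are omitted keep the defaults listed in the table.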
Use of Omniverse Farm is covered under the NVIDIA Omniverse License Agreement.