Welcome to USD Search API Helm Chart!
Homepage: https://docs.omniverse.nvidia.com/services/latest/services/usd-search/overview.html
Repository | Name | Version |
---|---|---|
 | asset-graph-service | 0.1.0 |
 | deepsearch | 1.0.0 |
 | deepsearch-crawler | 1.0.0 |
 | ngsearch | 1.0.0 |
https://charts.bitnami.com/bitnami | redis | 19.0.2 |
https://helm.neo4j.com/neo4j | neo4j | 5.23 |
https://opensearch-project.github.io/helm-charts/ | opensearch | 2.26.1 |
oci://registry-1.docker.io/bitnamicharts | deepsearch-explorer (nginx) | 18.2.0 |
oci://registry-1.docker.io/bitnamicharts | api-gateway (nginx) | 18.2.0 |
This chart assumes that a Kubernetes cluster is already available and configured.
Depending on the ingress controller and namespace, some changes may be required.
Kubernetes cluster with the following features enabled:
Optional:
Helm version higher than 3.0.0. To check this, run the following command:
$ helm version
version.BuildInfo{Version:"v3.2.0", GitCommit:"e11b7ce3b12db2941e90399e874513fbd24bcb71", GitTreeState:"clean", GoVersion:"go1.13.10"}
Generate your NGC Helm and container registry API key prior to fetching the Helm chart. See the onboarding guide here.
Fetch the latest Helm chart from the registry:
$ helm fetch https://helm.ngc.nvidia.com/nvidia/usdsearch/charts/usdsearch-1.0.0.tgz --username='$oauthtoken' --password=<YOUR API KEY>
USD Search can be deployed to index data from an AWS S3 bucket or an Omniverse Nucleus server. Detailed deployment commands can be found in the sections below.
To deploy the USD Search Helm chart and connect it to an AWS S3 bucket for indexing, run the following command:
helm install <deployment name> usdsearch-1.0.0.tgz \
--set global.accept_eula=true \
--set global.storage_backend_type=s3 \
--set global.s3.bucket_name=<S3 bucket name> \
--set global.s3.region_name=<S3 bucket region> \
--set global.s3.aws_access_key_id=<AWS access key ID> \
--set global.s3.aws_secret_access_key=<AWS secret access key> \
--set global.secrets.create.auth=true \
--set global.secrets.create.registry=true \
--set global.ngcAPIKey=<NGC API KEY> \
--set api-gateway.image.pullSecrets={nvcr.io}
NOTE: Please refer to NGC documentation for generating the NGC API Key.
This command creates all required secrets at Helm chart deployment time; therefore, subsequent installations / upgrades of the Helm chart do not require providing credential information and can be done as follows:
helm upgrade <deployment name> usdsearch-1.0.0.tgz --install \
--set global.accept_eula=true \
--set global.storage_backend_type=s3 \
--set global.s3.bucket_name=<S3 bucket name> \
--set global.s3.region_name=<S3 bucket region> \
--set api-gateway.image.pullSecrets={nvcr.io}
USD Search re-scans the S3 bucket daily to detect new data while minimizing load. You can adjust the re-scan frequency by adding the following argument to the installation or upgrade command:
--set global.s3.re_scan_timeout=<re scan timeout time in seconds>
NOTE: Setting this parameter too low may cause processing queues to grow faster than your USD Search instance can handle. It is recommended to monitor the processing queues in Grafana and adjust the parameter to prevent excessive queue growth.
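For reference, the default daily re-scan corresponds to 86400 seconds (24 × 60 × 60). As an illustrative, hedged example, re-scanning roughly twice a day would look like this:
--set global.s3.re_scan_timeout=43200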
The USD Search service requires a service account with administrator rights in order to index the content stored on the Nucleus server. It is possible to use the main Nucleus service account, generated during the Nucleus service installation. Alternatively, it is possible to create a dedicated service account for USD Search. For the exact steps on how such an account can be created, please follow this guide.
When creating your own service account it is required to grant it admin access. This guide explains the process in detail:
After the service account's username and password are obtained, it is possible to deploy the USD Search Helm chart and connect it to an Omniverse Nucleus server for indexing, as follows:
helm install <deployment name> usdsearch-1.0.0.tgz \
--set global.accept_eula=true \
--set global.storage_backend_type=nucleus \
--set global.nucleus.server=<Omniverse Nucleus server hostname or IP> \
--set global.nucleus.username=<Omniverse service account name> \
--set global.nucleus.password=<Omniverse service account password> \
--set global.secrets.create.auth=true \
--set global.secrets.create.registry=true \
--set global.ngcAPIKey=<NGC API KEY> \
--set api-gateway.image.pullSecrets={nvcr.io} \
--set deepsearch-crawler.crawler.extraConfig.exclude_patterns={"omniverse://[^\/]+/NVIDIA.*"}
NOTE: Please refer to NGC documentation for generating the NGC API Key.
NOTE: The /NVIDIA sample data mount is excluded from indexing by default. To enable its indexing, remove the corresponding line from the command. You can also customize URL patterns for indexing, as explained in the Indexing path filtering section.
This command creates all required secrets at Helm chart deployment time; therefore, subsequent installations / upgrades of the Helm chart do not require providing credential information and can be done as follows:
helm upgrade <deployment name> usdsearch-1.0.0.tgz --install \
--set global.accept_eula=true \
--set global.storage_backend_type=nucleus \
--set global.nucleus.server=<Omniverse Nucleus server hostname or IP> \
--set api-gateway.image.pullSecrets={nvcr.io} \
--set deepsearch-crawler.crawler.extraConfig.exclude_patterns={"omniverse://[^\/]+/NVIDIA.*"}
By default, when connected to an Omniverse Nucleus server, the search service verifies that the user has access to retrieved assets before returning them. The user is therefore required to provide either a username / password pair or a Nucleus authentication token in order to access search functionality.
At service deploy time it is possible to configure an admin access key that allows bypassing this access verification step by setting the access_key configuration parameter as follows:
--set ngsearch.microservices.search_rest_api.admin_authentication.access_key=<some key value>
When this field is left unset, the access key is auto-generated and stored in a ConfigMap on the Kubernetes cluster. To retrieve the value of this key, run the following:
$ export NAMESPACE=<deployment namespace>
$ export HELM_NAME=<deployment name>
$ echo $(kubectl get cm $HELM_NAME-ngsearch-env-config -n $NAMESPACE -o "jsonpath={.data.DEEPSEARCH_BACKEND_ADMIN_ACCESS_KEY}")
Alternatively it is possible to run the following command:
helm status <deployment name>
which will show information about the deployment and a list of useful commands. The value of the Admin access key is printed there as part of the Accessing USD Search section.
If access verification is not desired it is possible to disable it by providing the following argument to the installation or upgrade command:
--set ngsearch.microservices.search_rest_api.enable_access_verification=false
All USD Search functionality is unified under a single API gateway; its service configuration is described below. By default the ClusterIP service type is used; however, it is possible to override this and make the endpoint publicly accessible. Please refer to the NGINX Helm chart service configuration for a complete list of settings.
Below you can find several sample configurations for the API gateway endpoint, depending on the desired service type, and instructions on how to access it:
ClusterIP (default) - If the ClusterIP service type is used, you can access the API externally via port-forwarding from the Kubernetes cluster as described below. For internal access, use the service name directly.
API Gateway port-forward example:
$ kubectl port-forward -n <NAMESPACE> svc/<deployment name>-api-gateway 8080:80
The endpoint should then be accessible at http://localhost:8080.
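For access from inside the cluster, the gateway should be reachable through the standard Kubernetes service DNS name (a hedged example; the service name matches the one used in the port-forward command above):
http://<deployment name>-api-gateway.<NAMESPACE>.svc.cluster.local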
NodePort - You can specify the NodePort service type to make the endpoint publicly accessible using the Kubernetes cluster’s external IP. Typically, NodePort values must be in the >30000 range. To enable this, use the following command line arguments:
--set api-gateway.service.type=NodePort \
--set api-gateway.service.nodePorts.http=<NodePort value>
The service will then be accessible at http://<external cluster IP>:<NodePort value>
NOTE: Please refer to the Kubernetes documentation for more information about NodePort type services.
Ingress - You can enable Ingress to make the service available at a specified hostname. This requires an Ingress controller to be set up, and you may need to specify an Ingress class that matches your controller. Use the following command line arguments to configure it:
--set api-gateway.service.ingress.enabled=true \
--set api-gateway.service.ingress.hostname=<service hostname> \
--set api-gateway.service.ingress.ingressClassName=<ingress class name, e.g. 'nginx'>
The service will then be accessible at http://<service hostname>
The USD Search Helm chart includes monitoring functionality that sets up metrics collection and pre-configured dashboards for tracking background indexing progress and the overall system state. This requires the Prometheus Operator and Grafana with dashboard provisioning. You can use the Kube Prometheus Stack to meet these requirements. To enable monitoring, provide the following command line arguments:
--set deployGrafanaDashboards=true \
--set deployServiceMonitors=true
With the above flags enabled, metrics are automatically scraped by the Prometheus service (part of the Prometheus stack) and visualized in Grafana in the USD Search / Plugin Processing Dashboard and USD Search / Metadata Indexing and Crawler Dashboard dashboards.
The first installation of the deployment may take some time, as all the containers are pulled from the registry. Subsequent installations, however, will be faster.
To verify that the deployment is successful and the services are working as expected, run the following:
helm test <deployment name>
For a successful deployment this should print the following to standard output:
...
TEST SUITE: <deployment name>-ags-api-verification
Last Started: Fri Nov 8 12:01:47 2024
Last Completed: Fri Nov 8 12:01:52 2024
Phase: Succeeded
TEST SUITE: <deployment name>-database-search-api-verification
Last Started: Fri Nov 8 12:01:53 2024
Last Completed: Fri Nov 8 12:01:59 2024
Phase: Succeeded
TEST SUITE: <deployment name>-s3-storage-connection-verification
Last Started: Fri Nov 8 12:01:59 2024
Last Completed: Fri Nov 8 12:02:10 2024
Phase: Succeeded
...
In case some of these tests return a Failed status, please refer to the following resources for debugging the deployment:
NOTE: Changing the settings provided below is typically not required for a standard installation of USD Search. These settings, however, allow customizing the deployment and tuning it for the particular use-case and infrastructure.
Several additional parameters that allow customizing the USD Search service are described below. For convenience, instead of providing them all as command line arguments, it is recommended to pass them to the installation command as a configuration file, as follows:
helm install .... -f my-usdsearch-config.yaml
where the my-usdsearch-config.yaml file can contain the additional settings described below.
By default USD Search indexes all the assets that can be found on the server. It is, however, possible to explicitly define which URL patterns should be included in indexing and which should be excluded. URL definitions support Python regex syntax. To include / exclude certain file patterns, specify the following in the my-usdsearch-config.yaml file:
deepsearch-crawler:
  crawler:
    extraConfig:
      include_patterns:
        - <.*regexp of the folder that needs to be included.*>
      exclude_patterns:
        - <.*regexp of the folder that needs to be excluded.*>
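For example, the exclusion of the /NVIDIA sample data mount used in the Nucleus installation command above corresponds to the following entry:
deepsearch-crawler:
  crawler:
    extraConfig:
      exclude_patterns:
        - "omniverse://[^\/]+/NVIDIA.*"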
For rendering USD assets, the USD Search API relies on rendering jobs that are scheduled by Kubernetes on demand. If needed, the rendering Job configuration can be adjusted.
Resource requests and limits for each individual rendering job can be adjusted by updating the resources parameter in the deepsearch.microservices.k8s_renderer section of the my-usdsearch-config.yaml file. The default resource requests and limits for the job are outlined below:
requests:
  memory: "30Gi"
  cpu: "4"
  nvidia.com/gpu: "1"
limits:
  memory: "30Gi"
  cpu: "11"
  nvidia.com/gpu: "1"
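To override these defaults, place the resources block under the corresponding section of my-usdsearch-config.yaml; a sketch using the default values shown above (adjust them to your cluster):
deepsearch:
  microservices:
    k8s_renderer:
      resources:
        requests:
          memory: "30Gi"
          cpu: "4"
          nvidia.com/gpu: "1"
        limits:
          memory: "30Gi"
          cpu: "11"
          nvidia.com/gpu: "1"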
The rendering job uses Omniverse Kit to render USD assets. While rendering, Omniverse Kit caches some information to make processing more efficient. This includes the shader cache, which takes between 100 and 300 seconds to compute. By default, the cache is created as a memory volume that is preserved for the lifetime of the job and removed after it has completed.
To optimize processing it is, however, possible to persist this cached information so that subsequent runs of the job are faster. To achieve this, the deepsearch.persistence section needs to be configured appropriately.
For each of these cache locations there are several set-up options that are controlled by setting the type of cache:
NOTE: This is an experimental feature. Some functionality may not be supported in case of a heterogeneous GPU setup.
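A minimal, hedged sketch of where these settings live in my-usdsearch-config.yaml, assuming the persistence options are nested under the rendering job section as suggested by the initVolumePermissions parameter documented below (the exact cache-type keys are not reproduced here and should be taken from the chart values):
deepsearch:
  microservices:
    k8s_renderer:
      persistence:
        # Fix folder permissions on the shared cache volume if the backing
        # storage requires it (documented parameter, default false).
        initVolumePermissions: true
        # The cache volume type options described above are configured in
        # this persistence section as well; consult the chart values for
        # the exact keys.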
For convenience, configurable values of this Helm Chart are outlined in the sections below.
Key | Type | Default | Description |
---|---|---|---|
global.accept_eula |
bool
|
false
|
Set the following to true to indicate your acceptance of EULA. |
global.checks |
object
|
{
"ags": {
"enabled": true
},
"rest_api": {
"enabled": true
},
"storage_backend": {
"enabled": true
}
}
|
Configuration of helm hooks that are used for testing storage backend availability and API readiness. |
global.dnsConfig |
map
|
searches: []
|
Additional optional DNS configuration parameters can be passed using the following parameter: |
global.embedding_deployment |
map
|
enabled: true
type: "triton_server"
endpoint: ""
|
Embedding service instance is deployed with USD Search API Helm chart by default. It can be disabled, but in that case an alternative endpoint must be provided. USD Search supports various options for embedding model:
|
global.enable_structured_logging |
bool
|
false
|
Enable structured logging; logs can then be collected from container standard output and forwarded to any system for storing log data
global.imagePullSecrets |
map
|
null
|
Kubernetes secret that stores authentication information for pulling images from the registry. |
global.ngcAPIKey |
string
|
null
|
It is possible to provide NGC API Key on the first deployment of the helm chart, such that the appropriate docker registry pull secret is created. |
global.ngcAPIKeySecretName |
string
|
"ngc-api"
|
It is possible to provide NGC API Key on the first deployment of the helm chart, such that the appropriate docker registry pull secret is created. |
global.ngcImagePullSecretName |
string
|
"nvcr.io"
|
As an alternative to the |
global.nodeIP |
string
|
""
|
Please specify (preferably) the hostname or the IP of the Kubernetes cluster node, where USD Search API helm chart is running. This address will be used for service registration when using NodePort services. |
global.registry |
string
|
"nvcr.io/nvidia/usdsearch"
|
Container Registry root URL |
global.secrets |
object
|
{
"annotations": {
"helm.sh/resource-policy": "keep"
},
"create": {
"auth": false,
"ngc_api": false,
"registry": false
}
}
|
Auto-generated secrets configuration. By setting the respective field in the
|
global.storage_backend_type |
string
|
"s3"
|
Set the desired storage backend type. Supported options are the following:
|
global.tracing |
yaml
|
OTEL_SDK_DISABLED: "true"
OTEL_EXPORTER_OTLP_ENDPOINT: "http://tempo:4318"
|
OpenTelemetry tracing configuration. It is optionally possible to enable trace collection for the USD Search REST API and AGS
services. By default it is disabled. In order to enable it, please set |
The following parameters describe Omniverse Kit-based rendering Job configuration
Key | Type | Default | Description |
---|---|---|---|
deepsearch.microservices.k8s_renderer.activeDeadlineSeconds |
int
|
3600
|
Maximum amount of time that is allocated for job execution |
deepsearch.microservices.k8s_renderer.persistence.initVolumePermissions |
bool
|
false
|
Depending on the volume where the shared cache is stored, it may be necessary to update permissions of the
folder where cache data is written. To enable this functionality, please set this setting to |
deepsearch.microservices.k8s_renderer.render_job_pod_annotations |
map
|
map[]
|
Additional optional annotations of the Rendering Job pod |
deepsearch.microservices.k8s_renderer.resources |
map
|
requests:
memory: "30Gi"
cpu: "4"
nvidia.com/gpu: "1"
limits:
memory: "30Gi"
cpu: "11"
nvidia.com/gpu: "1"
|
Resources that are allocated for a single job execution. NOTE: The rendering job requires an RTX GPU, which cannot be requested as a resource. If your cluster has both RTX
and non-RTX GPUs, it is recommended to use an additional method to select a node with an RTX GPU, e.g. by
setting appropriate taints on RTX and non-RTX GPUs and adding a toleration in
|
deepsearch.microservices.k8s_renderer.ttlSecondsAfterFinished |
int
|
3600
|
Amount of time pod is kept after completion |
USD Search API service is built as a collection of plugins, each of which does a specific job. These plugins can optionally be activated and deactivated by modifying the following configuration and setting the active parameter of specific plugins to either True or False.
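For example, to activate the optional vision-metadata plugins listed in the plugin table further below while keeping everything else at its defaults, a my-usdsearch-config.yaml entry could look like this (plugin names taken from that table):
deepsearch:
  plugins:
    image_to_vision_metadata:
      active: True
    rendering_to_vision_metadata:
      active: True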
Each of the plugins is deployed as a separate Kubernetes Deployment that can be horizontally scaled. By default this functionality is disabled to avoid occupying too many resources. To enable it, add the following to the individual plugin configuration:
hpa:
  enabled: true
  minReplicas: <desired minimum number of replicas (default 1)>
  maxReplicas: <desired maximum number of replicas (default 1)>
  targetCPUUtilizationPercentage: <desired target CPU utilization (default 80)>
For example, for the image_to_embedding plugin the configuration can look like this:
deepsearch:
  # ...
  plugins:
    # ...
    image_to_embedding:
      hpa:
        enabled: true
        minReplicas: 1
        maxReplicas: 5
        targetCPUUtilizationPercentage: 80
Some plugins may rely on external services and do not do a lot of processing themselves. To increase throughput, it is possible to increase concurrency for each plugin by setting the n_concurrent_queue_workers parameter to the desired value. For plugins that require USD asset rendering this value is set to 256 by default; all others have this parameter set to 1.
Adjusting the value of this parameter may increase throughput and can be done as in the following example:
deepsearch:
  # ...
  plugins:
    # ...
    thumbnail_to_embedding:
      n_concurrent_queue_workers: 8
The full list of plugins with descriptions can be found below:
Key | Type | Default | Description |
---|---|---|---|
deepsearch.plugins.asset_graph_generation |
map
|
active: True
|
Construction of a scene graph for a given USD asset and adding it to the Asset Graph Service (AGS) index. |
deepsearch.plugins.image_to_embedding |
map
|
active: True
|
Extraction of CLIP embeddings from images. The following image types are supported:
|
deepsearch.plugins.image_to_vision_metadata |
map
|
active: False
|
Generation of Vision metadata from Images. The following image types are supported:
|
deepsearch.plugins.rendering_to_embedding |
map
|
active: True
n_concurrent_queue_workers: 256
|
Rendering USD assets from multiple views and extraction of CLIP embeddings from each view to better characterize the asset. |
deepsearch.plugins.rendering_to_vision_metadata |
map
|
active: False
n_concurrent_queue_workers: 256
|
Generation of Vision metadata from multi-view renderings of 3D assets that are found on the Storage backend. |
deepsearch.plugins.thumbnail_generation |
map
|
active: True
n_concurrent_queue_workers: 256
|
Automatic generation of thumbnails for USD assets found on the storage backend based on the rendered views received from the Rendering service. |
deepsearch.plugins.thumbnail_to_embedding |
map
|
active: True
|
Extraction of CLIP embeddings from thumbnails of assets that are found on the Storage backend. |
The following parameters describe embedding service configuration settings:
Key | Type | Default | Description |
---|---|---|---|
deepsearch.microservices.embedding.replicas |
int
|
1
|
Number of replicas of the embedding service. |
deepsearch.microservices.embedding.resources |
map
|
requests:
nvidia.com/gpu: 1
cpu: 2
memory: 7Gi
limits:
nvidia.com/gpu: 1
cpu: 4
memory: 15Gi
|
Resources that are allocated for embedding deployment NOTE: Embedding service could rely on either CPU or GPU. Using GPU, however, significantly speeds up inference. |
deepsearch.microservices.embedding.tmpDir |
map
|
emptyDir:
medium: Memory
sizeLimit: 256Mi
|
Configuration of the temporary directory for the service. By default, it is set to use Memory medium. |
deepsearch.microservices.embedding.tolerations |
map
|
{}
|
Embedding service tolerations. |
The following parameters allow configuring the Asset Graph Search (AGS) deployment:
Key | Type | Default | Description |
---|---|---|---|
asset-graph-service.graphdb |
map
|
{
"n_workers": 5
}
|
Settings for interaction with Neo4j backend. |
asset-graph-service.graphdb.n_workers |
map
|
5
|
Number of parallel workers that would be writing data to Neo4j. Increasing this number results in faster processing, but will in turn require scaling the Neo4j instance. |
asset-graph-service.sentry_dsn |
string
|
""
|
Sentry Data Source Name. By default this field is unset; however, it is possible to configure it to an appropriate DSN value in order to collect events that are associated with the AGS deployment. |
asset_graph_service_deployment.enabled |
bool
|
true
|
trigger to enable Asset Graph Service (AGS) helm chart deployment |
Key | Type | Default | Description |
---|---|---|---|
asset_graph_service_deployment.enabled |
bool
|
true
|
trigger to enable Asset Graph Service (AGS) helm chart deployment |
deepsearch-crawler.resources |
map
|
requests:
cpu: 100m
memory: 128Mi
|
Default USD Search Crawler resource requests and limits |
deepsearch.microservices.monitor |
map
|
{
"replicas": 1
}
|
Configuration of the Monitor service that runs in the background and indexes data on the storage backend |
deepsearch.microservices.omni_writer |
map
|
replicas: 1
|
Configuration of the Writer service |
deepsearch.vision_endpoint |
map
|
{
"azure_openai": {
"api_key_secret_field": "api-key",
"api_key_secret_name": "azure-openai-vlm-api-key-secret"
},
"nvidia_ai_endpoints": {
"api_key_secret_field": "api-key",
"api_key_secret_name": "nvidia-ai-endpoints-vlm-api-key-secret"
},
"openai": {
"api_key_secret_field": "api-key",
"api_key_secret_name": "openai-vlm-api-key-secret",
"model": "gpt-4o"
},
"vlm_service": ""
}
|
(Beta) Vision Language Model (VLM) configuration functionality settings |
deepsearch.vision_endpoint.vlm_service |
string
|
""
|
(beta) Currently supported VLM type options are:
|
deepsearch_feature |
map
|
enabled: true
|
It is possible to disable the AI USD Search functionality and run only core NGSearch services. |
ngsearch.microservices.indexing |
map
|
{
"replicas": 1
}
|
Storage backend indexing configuration |
ngsearch.microservices.search_rest_api.default_search_size |
int
|
64
|
Number of search results that are returned by the NGSearch Search Service by
default, when using non-paginated search functionality. This value can be
overridden from the input search query using the |
ngsearch.microservices.search_rest_api.enable_access_verification |
bool
|
true
|
In order to verify that the client application has access to view certain assets, all search results are verified with the Storage backend. While this functionality is crucial for Omniverse Nucleus servers with fine-grained access control, it may not be required for an AWS S3 bucket or in cases where all users have access to all the assets on the storage backend. In that case it is possible to switch off this functionality by setting this parameter to false, which also decreases the time needed to process search requests. NOTE: This functionality checks both the permissions and the existence of the asset. If immediate reflection of asset deletions in the API is required, please enable this functionality. |
ngsearch.microservices.search_rest_api.search_telemetry_stdout |
bool
|
false
|
If telemetry logging is enabled, it is additionally possible to log
telemetry information to stdout as structured logs, which can then be accumulated
in a database for analysis. In order to enable this functionality, set the
following parameter to |
ngsearch.microservices.search_rest_api.use_search_telemetry |
bool
|
false
|
Search service has the possibility to gather telemetry information about searches executed by the users of the system. This information can then be used to understand which queries are executed most frequently and also allows tracking down issues if inconsistencies appear in the search results. Telemetry information includes:
NOTE: No information about the user is stored in the system. By default telemetry logging is switched off. If you wish to
enable it, set the parameter below to |
ngsearch.microservices.storage |
map
|
{
"replicas": 1
}
|
Storage service configuration |
ngsearch.microservices.storage_cron |
map
|
{
"replicas": 1
}
|
Storage Cron Job configuration |
ngsearch.microservices.tagcrawler |
map
|
{
"replicas": 1
}
|
Storage backend tag-crawler configuration |
Configuration for the default Redis instance, which gets deployed when redis_deployment.enabled is set to true.
Key | Type | Default | Description |
---|---|---|---|
redis.architecture |
string
|
"standalone"
|
Redis architecture type |
redis.auth |
map
|
enabled: False
|
Redis authentication |
redis.commonConfiguration |
tpl/array
|
redis.commonConfiguration: |
appendonly yes
save ""
databases 32
|
Redis common configuration |
redis.master |
map
|
disableCommands: []
persistence:
enabled: True
size: 64Gi
resources:
limits:
memory: 10Gi
ephemeral-storage: 10Gi
cpu: 1000m
|
Redis master configuration |
redis.replica |
map
|
replicaCount: 0
|
Redis additional replica count |
redis_deployment |
map
|
enabled: true
|
A Redis instance can be deployed as part of USD Search API helm chart.
Set |
Configuration for the default OpenSearch cluster, which gets deployed when opensearch_deployment.enabled is set to true.
Key | Type | Default | Description |
---|---|---|---|
opensearch.clusterName |
string
|
"deepsearch-opensearch-cluster"
|
Default opensearch cluster name |
opensearch.config."opensearch.yml" |
tpl/array
|
opensearch.config."opensearch.yml": |
network.host: 0.0.0.0
#knn.algo_param.index_thread_qty: 8
plugins:
security:
disabled: true
|
Default opensearch deployment configuration |
opensearch.extraEnvs[0].name |
string
|
"DISABLE_INSTALL_DEMO_CONFIG"
|
|
opensearch.extraEnvs[0].value |
string
|
"true"
|
|
opensearch.masterService |
string
|
"deepsearch-opensearch-cluster-master"
|
Default opensearch master service name |
opensearch.opensearchJavaOpts |
string
|
"-Xmx2048M -Xms2048M"
|
Default opensearch Java options |
opensearch.persistence |
map
|
enabled: true
size: 100Gi
|
Default opensearch persistent configuration |
opensearch.replicas |
int
|
3
|
Default number of OpenSearch replicas. The larger the number of replicas, the higher the availability of the service (more search requests can be processed in parallel); as a drawback, more resources will be occupied. |
opensearch.resources |
map
|
requests:
cpu: "100m"
memory: "4Gi"
|
Default opensearch resource requests per replica |
opensearch.sysctl |
map
|
enabled: false
|
Set optimal sysctl's through securityContext. This requires privilege. Can be disabled if the system has already been pre-configured. (Ex: https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html) Also see: https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/ |
opensearch.sysctlInit |
map
|
{
"enabled": false
}
|
Set optimal sysctl's through privileged initContainer. |
opensearch_deployment |
map
|
enabled: true
|
An OpenSearch instance can be deployed as part of the USD Search API helm chart.
Set |
Configuration for the default Neo4j instance, which gets deployed when neo4j_deployment.enabled is set to true.
Key | Type | Default | Description |
---|---|---|---|
neo4j.config |
map
|
server.config.strict_validation.enabled: "false"
server.memory.heap.initial_size: "8000m"
server.memory.heap.max_size: "8000m"
|
Neo4j configuration settings |
neo4j.env.NEO4J_PLUGINS |
string
|
"[\"graph-data-science\", \"apoc\"]"
|
Neo4j plugins configuration |
neo4j.fullnameOverride |
string
|
"neo4j"
|
Name of the Neo4j instance |
neo4j.neo4j |
map
|
name: neo4j
password: "password"
resources:
requests:
cpu: "4000m"
memory: "14Gi"
limits:
cpu: "4000m"
memory: "14Gi"
|
Neo4j authentication and resource settings |
neo4j.serviceMonitor.enabled |
bool
|
false
|
|
neo4j.services |
map
|
neo4j:
enabled: true
annotations: {}
spec:
type: ClusterIP
|
Neo4j service settings |
neo4j.volumes.data.defaultStorageClass.accessModes[0] |
string
|
"ReadWriteOnce"
|
|
neo4j.volumes.data.defaultStorageClass.requests.storage |
string
|
"100Gi"
|
|
neo4j.volumes.data.mode |
string
|
"defaultStorageClass"
|
REQUIRED: specify a volume mode to use for data. Valid values are:
To get up-and-running quickly, for development or
testing, use |
neo4j_deployment.enabled |
bool
|
true
|
trigger to enable Neo4j helm chart deployment |
Name | Url |
---|---|
NVIDIA | https://www.nvidia.com/en-us/ |
GOVERNING TERMS:
If you download the software and materials as available from the NVIDIA AI product portfolio, use is governed by the NVIDIA Software License Agreement (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and the Product-Specific Terms for NVIDIA AI Products (found at https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/); except for the model which is governed by the NVIDIA AI Foundation Models Community License Agreement (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-ai-foundation-models-community-license-agreement/).
If you download the software and materials as available from the NVIDIA Omniverse portfolio, use is governed by the NVIDIA Software License Agreement (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and the Product-Specific Terms for NVIDIA Omniverse (found at NVIDIA Agreements | Enterprise Software | Product Specific Terms for Omniverse); except for the model which is governed by the NVIDIA AI Foundation Models Community License Agreement (found at https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-ai-foundation-models-community-license-agreement/).
USD Search API relies on Redis for all internal caching and uses the official Redis Helm chart for installation, which itself relies on the Persistent Volume Claim mechanism. For the Redis installation provided with USD Search API, a Persistent Volume is required that will then be claimed by Redis. An example configuration is shown below for a Persistent Volume that uses storage on the local file system.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sample-pv-name
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 100Gi
  local:
    path: /var/lib/omni/volumes/001
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - "node-name"
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
For more information on different types of Persistent Volumes and their setup procedures, please refer to the official Kubernetes documentation.
In rare cases the Redis append-only file (AOF) can end up in a corrupted state. In this case the following line will be printed in the logs of the Redis pod:
Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix ....
This may happen if the node running Redis was unexpectedly terminated or ran out of space.
To fix this issue, execute the following set of commands.
NOTE: The steps below assume that there is only a single instance of USD Search API installed in the provided namespace. If that is not the case, the way REDIS_STATEFULSET_NAME and REDIS_POD_NAME are computed needs to be updated.
# prepare some settings
export NAMESPACE=<namespace where USD Search API is running>
export REDIS_AOF_FILE_NAME=<corrupted file name from the Redis log>
# get the name of the statefulset that is controlling Redis
export REDIS_STATEFULSET_NAME=$(kubectl get statefulset -n $NAMESPACE -o custom-columns=":metadata.name" | grep redis)
# get the name of the pod running Redis
export REDIS_POD_NAME=$(kubectl get pods -n $NAMESPACE -o custom-columns=":metadata.name" | grep redis)
# patch Redis statefulset to sleep (to exit crashbackloop)
kubectl patch statefulset -n $NAMESPACE $REDIS_STATEFULSET_NAME -p '{"spec": {"template": {"spec":{"containers":[{"name": "redis","args": ["-c", "sleep 1000000000"]}]}}}}'
# give k8s 5 seconds to restart redis
sleep 5
# fix corrupted redis file
kubectl exec -it -n $NAMESPACE $REDIS_POD_NAME -- redis-check-aof --fix /data/appendonlydir/$REDIS_AOF_FILE_NAME
# revert statefulset patching
kubectl patch statefulset -n $NAMESPACE $REDIS_STATEFULSET_NAME -p '{"spec": {"template": {"spec":{"containers":[{"name": "redis","args": ["-c", "/opt/bitnami/scripts/start-scripts/start-master.sh"]}]}}}}'
# delete running container to make sure it restarts
kubectl delete pods -n $NAMESPACE $REDIS_POD_NAME
When using the instance of OpenSearch provided with the USD Search API Helm chart, the official OpenSearch Helm chart is used for installation. This Helm chart by default installs a 3-node OpenSearch instance and requires Persistent Volume storage. Persistent Volumes can be created in exactly the same way as described in the previous section.
When configuring USD Search API, a failure to register with the Nucleus Discovery service will happen if the provided Nucleus registration token is incorrect. If this occurs, the following error may be displayed:
Deployment: internal registration failed: DENIED
To solve this issue, the correct service registration token needs to be provided and can be located in the following subfolder within the Nucleus Docker Compose installation location:
base_stack/secrets/svc_reg_token
On some systems, the value of the kernel parameter vm.max_map_count may be too low for OpenSearch. If this is the case, it is required to update the default value of vm.max_map_count to at least 262144, as described in the OpenSearch installation documentation.
To check the current value, run this command:
cat /proc/sys/vm/max_map_count
To increase the value, add the following line to /etc/sysctl.conf:
vm.max_map_count=262144
Then run the following to reload and apply the settings change.
sudo sysctl -p
Helm chart installation assumes that the storage backend (AWS S3 bucket or Omniverse Nucleus server) is available before installation and that valid credential information is provided.
For convenience, a Helm pre-installation hook is included that checks the backend connection before installing the Helm chart.
If the storage backend is not available then, depending on the backend type, one of the following errors will be printed during execution of the helm install command:
Error: INSTALLATION FAILED: failed pre-install: 1 error occurred:
* job test-nucleus-storage-connection-verification failed: BackoffLimitExceeded
or
Error: INSTALLATION FAILED: failed pre-install: 1 error occurred:
* job test-s3-storage-connection-verification failed: BackoffLimitExceeded
The connection with the storage backend can also break after the Helm chart is installed, for example if the storage backend becomes unreachable. In this case, you may notice that many pods enter the CrashLoopBackOff state. To confirm that the issue is indeed related to the storage backend connection, you can do one of the following:
Run helm test, which will verify the storage backend connection, as follows:
helm test <deployment name> --hide-notes
Check the logs of any pod that entered CrashLoopBackOff; ConnectionError messages indicate that the storage backend is unavailable for some reason.
If slow search speeds are encountered, several optimizations can be made at the Helm chart level.
By default, the embedding service running the NVCLIP model relies on the CPU for inference. It is, however, possible to make it use a GPU, which significantly speeds up inference. Please refer to the Embedding service configuration section.
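As a hedged illustration of where a GPU request would go, the following snippet mirrors the default values of the deepsearch.microservices.embedding.resources parameter documented in the Embedding service configuration table; whether the GPU is actually used also depends on the embedding backend selected:
deepsearch:
  microservices:
    embedding:
      resources:
        requests:
          nvidia.com/gpu: 1
          cpu: 2
          memory: 7Gi
        limits:
          nvidia.com/gpu: 1
          cpu: 4
          memory: 15Gi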
If the cluster permits, it is possible to increase the number of OpenSearch replicas. By default the Helm chart is set to use 3 replicas, which we found to be sufficient in our experiments; however, this parameter can be overridden. It is therefore recommended to check the opensearch.replicas setting in my-usdsearch-config.yaml and adjust it according to the amount of available resources. Alternatively, the desired number of OpenSearch replicas can be set as a command line argument as follows:
--set opensearch.replicas=<desired number of OpenSearch replicas>
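The equivalent entry in my-usdsearch-config.yaml would look like the following (5 replicas chosen purely as an illustration):
opensearch:
  replicas: 5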
If slow indexing speeds are encountered, several optimizations can be made at the Helm chart level.
By default, rendering jobs use the memory medium for the shader cache, which is only available during the lifetime of a job; the cache therefore needs to be re-calculated for each rendering job, adding significant overhead. Please refer to the Rendering Job configuration section for more information on how to set up persistence.
Rendering jobs are only allocated when enough resources are available on the cluster, so adding a node with more GPUs will linearly increase indexing speed.