Synthetic Data Generation

Description

Synthetic Data Generation is an Omniverse app that lets developers and researchers generate synthetic data using Omniverse Replicator.

Publisher: NVIDIA
Latest Tag: 0.0.16-beta
Modified: July 3, 2025
Compressed Size: 2.38 GB
Platform: Linux / amd64

Synthetic Data Generation

Generating synthetic data in the cloud is key to scaling deep learning workflows. This container gives you access to the Synthetic Data Generation app, an integrated development environment (IDE) that lets developers generate synthetic data by exposing Omniverse Replicator.

Omniverse Replicator is a highly extensible framework built on a scalable Omniverse platform that enables physically accurate 3D synthetic data generation to accelerate training and performance of AI perception networks.

Omniverse Replicator provides deep learning engineers and researchers with a set of tools and workflows to bootstrap model training, improve the performance of existing models, or develop new types of models that were not previously possible because the required datasets or annotations did not exist. Users can import simulation-ready assets to build contextually aware 3D scenes, which supports a data-centric approach by creating new types of datasets and annotations that were not previously available.

Built on open-source standards like Universal Scene Description (USD), PhysX, and Material Definition Language (MDL), Omniverse Replicator can be integrated with or connected to existing pipelines through extensible Python APIs.

Omniverse Replicator is built on the highly extensible OmniGraph architecture that allows users to easily extend the built-in functionalities to create datasets for their own needs. It provides an extensible registry of annotators and writers to address custom requirements around type of annotations and output formats needed to train AI models. In addition, extensible randomizers allow the creation of programmable datasets that enable a data-centric approach to training these models.

Features

Semantic Schema Editor: provides a UI for applying semantic annotations to prims on the stage.
Visualizer: lets you visualize semantic labels for 2D/3D bounding boxes, normals, depth, and more.
Randomizers: let developers create domain-randomized scenes by quickly sampling from assets, materials, lighting, and camera positions.
omni.syntheticdata: provides low-level integration with the RTX renderer and the OmniGraph computation graph system. It powers the computation graphs for Replicator’s ground truth extraction Annotators, passing Arbitrary Output Variables (AOVs) from the renderer through to the Annotators.
Annotators: ingest the AOVs and other output from the omni.syntheticdata extension to produce precisely labeled annotations for DNN training.
Writers: process the images and other annotations from the Annotators and produce DNN-specific data formats for training, writing to local storage or over the network to cloud-based storage backends such as SwiftStack.

For a complete list of updates and features, see the release notes.

Cloud Deployment Instructions

This release is offered as a container that runs locally or on NVIDIA RTX-equipped Amazon Web Services (AWS) instances. The cloud-based delivery brings the latest RTX graphics and performance to any desktop system without requiring local NVIDIA RTX GPUs.

For Cloud Deployment installation steps, see the documentation.

Prerequisites

Using the NGC container requires the host system to have the following installed:

Docker Engine
NVIDIA GPU Drivers
NVIDIA Container Toolkit

For supported versions, see the NVIDIA Container Toolkit Documentation.

No other installation, compilation, or dependency management is required. It is not necessary to install the NVIDIA CUDA Toolkit.
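
As a quick sanity check before pulling the image, the sketch below reports which host tools are missing. The helper name missing_tools is hypothetical, and probing for docker (the Docker CLI) and nvidia-smi (installed with the NVIDIA driver) is an assumption about what is worth checking. It only confirms that the commands are on PATH, not that their versions are supported.

```shell
# missing_tools: print each named command that is not found on PATH.
# Hypothetical helper for illustration; it is not part of the container.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running missing_tools docker nvidia-smi prints nothing when both commands are available, and prints the name of each one that is not.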

Starting Omniverse Replicator

To run a container, issue the appropriate command as explained in the Running A Container chapter in the NVIDIA Containers For Deep Learning Frameworks User’s Guide and specify the registry, repository, and tags. For more information about using NGC, refer to the NGC Container User Guide.

To pull the container, first log in to the NGC Docker registry.
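
The login and pull steps can be sketched as follows. The helper names ngc_image and ngc_login and the NGC_API_KEY environment variable are assumptions made for this example. The registry host nvcr.io and the image path come from the commands on this page, and NGC expects the literal username $oauthtoken with your API key as the password.

```shell
# Hypothetical helpers for illustration. NGC_API_KEY is an assumed
# environment variable that holds your NGC API key.

# Print the full image reference for the tag given as $1.
ngc_image() {
  printf 'nvcr.io/nvidian/ov-synthetic-data-generation:%s\n' "$1"
}

# Log in to nvcr.io. The username is the literal string $oauthtoken
# (single quotes stop the shell from expanding it). The API key is piped
# on stdin so it does not appear on the docker command line.
ngc_login() {
  : "${NGC_API_KEY:?set NGC_API_KEY first}"
  printf '%s' "$NGC_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

With the key exported, ngc_login && docker pull "$(ngc_image xx)" logs in and pulls the image, where xx is the container version.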

If you have Docker 19.03 or later, a typical command to launch the container is:

docker run --gpus all --entrypoint /bin/bash -it nvcr.io/nvidian/ov-synthetic-data-generation:xx

If you have Docker 19.02 or earlier, a typical command to launch the container is:

nvidia-docker run --entrypoint /bin/bash -it nvcr.io/nvidian/ov-synthetic-data-generation:xx

Where:

xx is the container version, for example 0.0.16-beta.
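
Choosing between the two launch commands can be scripted. This is a minimal sketch, assuming GNU sort with version ordering (sort -V) is available. The helper names supports_gpus_flag and launch_command are hypothetical, and the helpers only print the command rather than running it.

```shell
# Hypothetical helpers for illustration.

# Succeed when the Docker version in $1 is 19.03 or later.
# Relies on `sort -V` (version sort) from GNU coreutils.
supports_gpus_flag() {
  [ "$(printf '%s\n' 19.03 "$1" | sort -V | head -n 1)" = "19.03" ]
}

# Print the launch command that matches Docker version $1 and image tag $2.
launch_command() {
  image="nvcr.io/nvidian/ov-synthetic-data-generation:$2"
  if supports_gpus_flag "$1"; then
    printf 'docker run --gpus all --entrypoint /bin/bash -it %s\n' "$image"
  else
    printf 'nvidia-docker run --entrypoint /bin/bash -it %s\n' "$image"
  fi
}
```

For example, launch_command "$(docker version --format '{{.Server.Version}}')" xx prints the command suited to the installed Docker version.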

Running Omniverse Replicator

Within the container you are ready to run Omniverse Replicator. From inside the container, run the example script test.py with the following command:

./startup.sh --no-window --/omni/replicator/script=test.py

After this, you will see an output folder with the images of basic shapes.

Note: The first launch takes about 2 minutes to start up; subsequent runs are much faster. The next section describes how to avoid the first-launch delay.

Accelerating Start up time

The first time you launch the container, start up takes about 2 minutes because shaders are compiled, regardless of how much data you generate. To avoid this delay each time you deploy the container on the same machine, follow these steps. First, start the container:

nvidia-docker run --entrypoint /bin/bash -it nvcr.io/nvidian/ov-synthetic-data-generation:xx

Within the container run:

./cache_script.sh

After the script has finished, commit the container from a different terminal (for more information, see the docker commit documentation):

docker commit [OPTIONS] CONTAINER ov-synthetic-data-generation-startup:v1

CONTAINER refers to the container running in the other terminal. You can find its ID or name with docker container ls.

After committing, you can close the original container and launch the committed image instead. Shaders will recompile if you launch this image on a new machine or with a different driver version. Even a patch-level driver difference triggers recompilation.
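
The lookup and commit steps can be combined in a small helper. This is a sketch under stated assumptions: the function name commit_warm_container is hypothetical, and it assumes the container in which cache_script.sh ran is still running. If several containers from the same image are running, it takes the first one listed.

```shell
# Hypothetical helper for illustration: commit the running container that
# was started from image $1 as a new image named $2.
commit_warm_container() {
  # `docker container ls` lists running containers; -q prints only IDs and
  # the ancestor filter keeps those created from the given image.
  cid="$(docker container ls -q --filter "ancestor=$1" | head -n 1)"
  if [ -z "$cid" ]; then
    echo "no running container found for image $1" >&2
    return 1
  fi
  docker commit "$cid" "$2"
}
```

For example, run commit_warm_container nvcr.io/nvidian/ov-synthetic-data-generation:xx ov-synthetic-data-generation-startup:v1 from the second terminal, then launch later sessions with docker run --gpus all --entrypoint /bin/bash -it ov-synthetic-data-generation-startup:v1.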

Refer to our Omniverse Replicator User Guide for more information.

License

By pulling and using the container, you accept the terms and conditions of the NVIDIA Omniverse License Agreement.
