Container for Training Parabricks DeepVariant

Description
This container runs the data preprocessing stage of the DeepVariant training pipeline, including the make_examples and shuffle steps.
Publisher: NVIDIA Clara Parabricks
Latest Tag: 4.3.2-1
Modified: March 1, 2025
Compressed Size: 1.31 GB
Multinode Support: No
Multi-Arch Support: No
This container runs the data preprocessing stage of the DeepVariant training pipeline: make_examples and shuffle. It writes its output as tfrecord.gz files, which can be fed to DeepVariant's model_train step.

This is a template command for running GPU-accelerated make_examples:

docker run --gpus all --rm -v <DATA_DIR>:<DATA_DIR> nvcr.io/nvidia/clara/deepvariant_train:4.2.0-1 \
  pbrun make_examples --ref <REF_FILE> --reads <BAM_FILE> --truth-variants <TRUTH_VCF> --confident_regions <TRUTH_BED> \
  --examples <TFRECORD_FILE> --disable-use-window-selector-model --channel-insert-size \
  --num-gpus <GPU_NUM> --num-cpu-threads-per-stream <WORKER_THREAD_NUM> --num-zipper-threads <ZIPPER_THREAD_NUM>
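
As an illustration of how the placeholders in the template might be filled in, here is a minimal dry-run sketch. It only assembles and prints the command; it does not start the container. Every path, file name, and thread or GPU count below is a hypothetical value chosen for the example, not a default of the container. The image tag is copied from the template, while the listing above gives 4.3.2-1 as the latest tag.

```shell
#!/bin/sh
# Dry-run sketch: fill in the make_examples template and print the result.
# All values below are hypothetical placeholders; substitute your own.
DATA_DIR="/data/dv_train"
IMAGE="nvcr.io/nvidia/clara/deepvariant_train:4.2.0-1"  # tag as written in the template
REF_FILE="${DATA_DIR}/ref.fa"
BAM_FILE="${DATA_DIR}/sample.bam"
TRUTH_VCF="${DATA_DIR}/truth.vcf.gz"
TRUTH_BED="${DATA_DIR}/truth.bed"
TFRECORD_FILE="${DATA_DIR}/examples.tfrecord.gz"
GPU_NUM=1
WORKER_THREAD_NUM=4
ZIPPER_THREAD_NUM=4

# Same flags, in the same order, as the template command.
MAKE_EXAMPLES_CMD="docker run --gpus all --rm -v ${DATA_DIR}:${DATA_DIR} ${IMAGE} \
pbrun make_examples --ref ${REF_FILE} --reads ${BAM_FILE} \
--truth-variants ${TRUTH_VCF} --confident_regions ${TRUTH_BED} \
--examples ${TFRECORD_FILE} --disable-use-window-selector-model --channel-insert-size \
--num-gpus ${GPU_NUM} --num-cpu-threads-per-stream ${WORKER_THREAD_NUM} \
--num-zipper-threads ${ZIPPER_THREAD_NUM}"

# Print the assembled command so it can be reviewed before it is run.
echo "$MAKE_EXAMPLES_CMD"
```

Mounting the data directory at the same path inside the container, as the template does with -v <DATA_DIR>:<DATA_DIR>, means the file paths passed to pbrun are valid both on the host and in the container.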

This is a template command for running accelerated shuffle:

docker run --gpus all --rm -v <DATA_DIR>:<DATA_DIR> nvcr.io/nvidia/clara/deepvariant_train:4.2.0-1 \
  pbrun shuffle --input_pattern_list <INPUT_PATTERN_LIST> --output_pattern_prefix <OUTPUT_PATTERN_PREFIX> \
  --output_dataset_config <OUTPUT_PBTXT_FILE> --output_dataset_name <DATASET_NAME> --direct-num-workers <WORKER_THREAD_NUM>
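
The shuffle template can be filled in the same way. The sketch below wraps the call in a small hypothetical helper, run, which prints the command when DRY_RUN is 1 (the default here) and executes it otherwise. The directory layout, output prefix, dataset name, and worker count are assumptions made for the example, and the input pattern assumes make_examples wrote a single tfrecord.gz file.

```shell
#!/bin/sh
# Sketch: fill in the shuffle template, with an optional dry-run mode.
# All values below are hypothetical placeholders; substitute your own.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

DATA_DIR="/data/dv_train"
IMAGE="nvcr.io/nvidia/clara/deepvariant_train:4.2.0-1"  # tag as written in the template
INPUT_PATTERN_LIST="${DATA_DIR}/examples.tfrecord.gz"
OUTPUT_PATTERN_PREFIX="${DATA_DIR}/shuffled/examples.shuffled"
OUTPUT_PBTXT_FILE="${DATA_DIR}/shuffled/dataset_config.pbtxt"
DATASET_NAME="example_training_set"
WORKER_THREAD_NUM=4

# Same flags, in the same order, as the template command.
SHUFFLE_OUT=$(run docker run --gpus all --rm -v "${DATA_DIR}:${DATA_DIR}" "$IMAGE" \
  pbrun shuffle --input_pattern_list "$INPUT_PATTERN_LIST" \
  --output_pattern_prefix "$OUTPUT_PATTERN_PREFIX" \
  --output_dataset_config "$OUTPUT_PBTXT_FILE" \
  --output_dataset_name "$DATASET_NAME" \
  --direct-num-workers "$WORKER_THREAD_NUM")

# In dry-run mode this is the assembled command; otherwise it is the tool's output.
printf '%s\n' "$SHUFFLE_OUT"
```

Setting DRY_RUN=0 in the environment would run the container for real, which requires Docker with GPU support and the input files in place.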