

clara_pt_deepgrow_3d_annotation


The trained 3D model can be used with the AIAA server with Slicer, Fovia and OHIF for annotating abdominal organs



Use Case



Clara Train

Latest Version



March 25, 2022


460.79 MB

Model Overview

The trained model can be used with the AIAA server with Slicer, Fovia and OHIF for annotating abdominal organs. The best results are expected on the organs that this model was trained on, which are listed in the Data section. The model may be applicable to unseen organs and unseen data; however, annotation performance is not guaranteed in those cases.

Note: The 4.1 version of this model is only compatible with the 4.1 version of the Clara Train SDK container

Model Architecture

A 3D UNet [1] with residual blocks and 32 channels has been used with 5 encoding levels. The network has 22.62 million trainable parameters.

Below is a pipeline indicating dataset preparation where 3D volumes are split into cubic 3D patches. The prepared dataset is utilized for training a Deepgrow [3] model.



The current training configuration, apart from the traditional deep learning hyperparameters, is set for 15 click interactions in both training and validation. The click-interaction count defines how many positive and negative clicks are simulated and supplied as additional feature maps before the model takes a single step/iteration on a batch of images. Single and multi-GPU training options are available.
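The click simulation described above can be sketched as follows. This is a simplified illustration, not the Clara Train implementation: the function names, the "correct the larger error region first" rule, and the stand-in for re-running the network are all assumptions made for the example. Masks are plain nested lists indexed as `[z][y][x]`.

```python
import random


def simulate_click(pred, label, rng):
    """Pick one corrective click, or return None if pred already matches label."""
    false_neg, false_pos = [], []
    for z, plane in enumerate(label):
        for y, row in enumerate(plane):
            for x, truth in enumerate(row):
                guess = pred[z][y][x]
                if truth and not guess:
                    false_neg.append((z, y, x))  # missed foreground
                elif guess and not truth:
                    false_pos.append((z, y, x))  # spurious foreground
    if not false_neg and not false_pos:
        return None
    # Assumption: click inside whichever error region is larger.
    if len(false_neg) >= len(false_pos):
        return "positive", rng.choice(false_neg)
    return "negative", rng.choice(false_pos)


def simulate_interactions(pred, label, max_clicks=15, seed=0):
    """Collect up to max_clicks simulated clicks; pred is modified in place."""
    rng = random.Random(seed)
    clicks = {"positive": [], "negative": []}
    for _ in range(max_clicks):
        click = simulate_click(pred, label, rng)
        if click is None:
            break
        kind, (z, y, x) = click
        clicks[kind].append((z, y, x))
        # Stand-in for re-running the network with the updated guidance:
        # here the clicked voxel is simply corrected.
        pred[z][y][x] = label[z][y][x]
    return clicks
```

In a real training loop the prediction would be refreshed by a forward pass after each click, and only after all simulated clicks are collected would the optimizer step be taken for the batch.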

  • Script:
  • GPU: At least 16 GB of GPU memory (the provided model was trained in parallel on 8 GPUs).
  • Actual Model Input: 128 x 128 x 128
  • AMP: False
  • Optimizer: Adam
  • Learning Rate: 1e-4
  • Loss: DiceLoss
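For readers unfamiliar with the loss listed above, a Dice loss can be sketched in a few lines. This is a generic soft Dice formulation on flattened values; the smoothing term and the function name are assumptions for illustration, and the exact options used by the `DiceLoss` in this model's configuration are not stated on this page.

```python
def soft_dice_loss(probs, target, smooth=1e-5):
    """Soft Dice loss for one flattened volume.

    probs:  predicted foreground probabilities in [0, 1]
    target: ground-truth labels (0 or 1), same length as probs
    smooth: small constant (assumed value) that avoids division by zero
    """
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(probs) + sum(target)
    dice = (2.0 * intersection + smooth) / (denominator + smooth)
    return 1.0 - dice
```

The loss approaches 0 when the prediction overlaps the label perfectly and approaches 1 when there is no overlap.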

The training utilizes patches of 128 x 128 x 128 which are extracted based on the label for the training process. The positive and negative click maps are automatically handled internally.
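One way to picture label-based patch extraction is to center a 128 x 128 x 128 window on a point inside the label and clamp it to the volume. The page does not say how the patch center is chosen or how small volumes are handled, so the centering rule, the error on undersized volumes, and the function name below are assumptions.

```python
def patch_bounds(center, volume_shape, patch_size=(128, 128, 128)):
    """Return per-axis (start, stop) indices of a patch around `center`.

    The window is shifted as needed so it stays inside the volume.
    """
    bounds = []
    for c, dim, size in zip(center, volume_shape, patch_size):
        if dim < size:
            # Assumption: a real pipeline would pad instead of failing.
            raise ValueError("volume is smaller than the patch along one axis")
        start = min(max(c - size // 2, 0), dim - size)
        bounds.append((start, start + size))
    return bounds
```

For example, a center of `(10, 256, 500)` in a `(200, 512, 512)` volume yields `[(0, 128), (192, 320), (384, 512)]`: the first and last axes are clamped to the volume edges.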


The training data is from the MICCAI 2015 Challenge: Multi-Atlas Labeling Beyond The Cranial Vault [2]. Link to the challenge data: !Synapse:syn3193805/wiki/217789. The dataset has 13 labeled organs, and all organs were used for training this model. Further details about the data can be found at the aforementioned link. Medical Segmentation Decathlon (MSD) data was also used for testing on the Spleen and Liver tasks.

  • Target: Annotating Region of Interest
  • Task: Annotation
  • Modality: CT
  • Size: 30 3D Volumes (After preparing into 3D patches, Train and Validation split was 80/20). Additional data from MSD was used for testing for Spleen (10 3D Volumes) and Liver Tasks (10 3D Volumes).
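The 80/20 train/validation split mentioned above can be illustrated with a seeded shuffle over the prepared patches. The shuffling strategy, the seed, and the function name are assumptions; the page only states the split ratio.

```python
import random


def train_val_split(items, train_fraction=0.8, seed=0):
    """Shuffle items reproducibly and split them into (train, validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With 30 items this gives 24 for training and 6 for validation. In this model's pipeline the split was applied after the volumes were prepared into 3D patches, so the items would be patches rather than whole volumes.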

The data must be converted to 1mm x 1mm x 3mm resolution before training
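To see what this conversion does to a volume's dimensions, the voxel grid size after resampling can be estimated from the original shape and spacing. This is only the shape arithmetic, not an interpolation routine, and it assumes axes are ordered (x, y, z) with the 3 mm spacing along z; the function name is hypothetical.

```python
def resampled_shape(shape, spacing, target_spacing=(1.0, 1.0, 3.0)):
    """Estimate the voxel grid size after resampling to target_spacing (mm).

    The physical extent along each axis (voxels * spacing) is preserved.
    """
    return tuple(
        max(1, round(n * s / t))
        for n, s, t in zip(shape, spacing, target_spacing)
    )
```

For example, a 512 x 512 x 100 volume with 0.8 x 0.8 x 2.5 mm spacing becomes roughly 410 x 410 x 83 voxels at 1 x 1 x 3 mm.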

Run sh --help from the *MMAR/commands folder to see more options for preparing the dataset for training.


Testing was performed on the Medical Segmentation Decathlon dataset for the Spleen and Liver tasks. For a random selection of 10 volumes per task, the following Dice scores were achieved with 10-15 clicks for the entire 3D volume: ~0.95 on Spleen and ~0.92 on Liver.


A graph showing the validation mean dice over N Steps/Epochs for Spleen.
The model achieves Dice scores of ~0.84 and ~0.85 for training and validation, respectively, on an 80/20 split. Dice scores of ~0.95 and ~0.92 are achieved for Spleen and Liver, respectively, on the MSD test volumes.


How to Use this Model

The model was validated with NVIDIA hardware and software. For hardware, the model can run on any NVIDIA GPU with memory greater than 16 GB. For software, this model is usable only as part of Transfer Learning & Annotation Tools in Clara Train SDK container. Find out more about Clara Train at the Clara Train Collections on NGC.

Full instructions for the training and validation workflow can be found in our documentation.


Input: 3-channel 3D CT volume with normalized intensity in HU and fixed spacing of 1 x 1 x 3 mm

The first channel is the image volume; the other two channels are positive and negative guidance maps that are based on user interaction/clicks.
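The three-channel layout can be sketched as below. The guidance maps here are binary, with a 1 at each clicked voxel; actual implementations typically smooth the clicks (for example with a Gaussian), which is omitted. The function names and the `[z][y][x]` nested-list representation are assumptions for illustration.

```python
def guidance_map(shape, clicks):
    """Build a binary map of the given (depth, height, width) with 1.0 at each click."""
    depth, height, width = shape
    volume = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for z, y, x in clicks:
        volume[z][y][x] = 1.0
    return volume


def build_input(image, positive_clicks, negative_clicks):
    """Stack image, positive guidance, and negative guidance as channels."""
    shape = (len(image), len(image[0]), len(image[0][0]))
    return [
        image,                                  # channel 0: CT intensities
        guidance_map(shape, positive_clicks),   # channel 1: positive clicks
        guidance_map(shape, negative_clicks),   # channel 2: negative clicks
    ]
```

The result is channel-first: indexing is `[channel][z][y][x]`.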


  1. Resampling spacing as 1 x 1 x 3mm
  2. Normalizing with subtrahend at 208.0 and divisor at 308.0 HU
  3. Converting to channel first
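The normalization in step 2 is a simple shift and scale of the HU values. A minimal sketch on a flat list of intensities, using the constants listed above (the function name is hypothetical):

```python
def normalize_hu(values, subtrahend=208.0, divisor=308.0):
    """Shift and scale HU intensities: (value - subtrahend) / divisor."""
    return [(v - subtrahend) / divisor for v in values]
```

With these constants, 208 HU maps to 0.0, 516 HU maps to 1.0, and -100 HU maps to -1.0.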


Output: 1 channel

  • Label 0: everything else
  • Label 1: object/region of interest


This training and inference pipeline was developed by NVIDIA. It is based on a segmentation model developed by NVIDIA researchers. This research-use-only software has not been cleared or approved by the FDA or any other regulatory agency. Clara pre-trained models are for developmental purposes only and cannot be used directly for clinical procedures.


[1] Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In International conference on medical image computing and computer-assisted intervention 2016 Oct 17 (pp. 424-432). Springer, Cham.

[2] Landman, B., et al. "Multi-atlas labeling beyond the cranial vault." (2015).

[3] Sakinis, Tomas, et al. "Interactive segmentation of medical images through fully convolutional neural networks." arXiv preprint arXiv:1903.08205 (2019).


End User License Agreement is included with the product. Licenses are also available along with the model application zip file. By pulling and using the Clara Train SDK container and downloading models, you accept the terms and conditions of these licenses.