clara_pt_deepgrow_2d_annotation

The trained 2D model can be used with the AIAA server, together with Slicer, Fovia, and OHIF, for annotating abdominal organs.



  • Framework: Clara Train
  • Modified: March 25, 2022
  • Size: 61.32 MB

Model Overview

The trained model can be used with the AIAA server, together with Slicer, Fovia, and OHIF, for annotating abdominal organs. The best results are expected on the organs that this model was trained on, which are listed in the Data section. The model may be applicable to unseen organs and unseen data; however, annotation performance in those cases is not guaranteed.

Note: The 4.1 version of this model is only compatible with the 4.1 version of the Clara Train SDK container.

Model Architecture

A 2D UNet [1] with residual blocks and 32 channels has been used with 5 encoding levels. The network has 7.28 million trainable parameters.

Below is a pipeline illustrating dataset preparation, where 3D volumes are split into 2D slices. The prepared dataset is utilized for training a Deepgrow [3] model.
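The slice-splitting step of the pipeline can be sketched as follows. This is a minimal illustration (the `volume_to_slices` helper and the choice to drop empty slices are assumptions, not the exact MMAR preparation script):

```python
import numpy as np

def volume_to_slices(image_3d, label_3d, axis=0, keep_empty=False):
    """Split a 3D CT volume and its label mask into 2D (image, label) slice pairs.

    Hypothetical helper illustrating the dataset-preparation step;
    the actual MMAR script may differ in naming and filtering logic.
    """
    pairs = []
    for i in range(image_3d.shape[axis]):
        img = np.take(image_3d, i, axis=axis)
        lab = np.take(label_3d, i, axis=axis)
        # Interactive-annotation training typically keeps only slices that
        # contain the organ of interest; empty slices give no clicks to simulate.
        if keep_empty or lab.any():
            pairs.append((img, lab))
    return pairs
```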



The training was performed with the following:

  • Script:
  • GPU: At least 12 GB of GPU memory.
  • Actual Model Input: 512 x 512
  • AMP: False
  • Optimizer: Adam
  • Learning Rate: 1e-4
  • Loss: DiceLoss

Apart from the traditional deep learning hyper-parameters, the current training configuration is set for 15 click interactions for both training and validation. Click interactions define how many positive and negative clicks are simulated for the additional feature maps before a single step/iteration is taken by the deep learning model for a batch of images. Single- and multi-GPU training options are available.
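The click-simulation idea described above can be sketched as follows. This is an assumed, simplified version of Deepgrow-style interaction (not the SDK's implementation): a positive click is sampled from false-negative pixels, a negative click from false-positive pixels, and each click is rendered as a Gaussian guidance map:

```python
import numpy as np

def simulate_click(pred, gt, sigma=2.0):
    """Sample one corrective click and render it as a Gaussian guidance map.

    Simplified sketch of click simulation: positive clicks land on missed
    foreground (false negatives), negative clicks on spurious foreground
    (false positives). Returns the guidance map and whether it is positive.
    """
    fn = np.logical_and(gt == 1, pred == 0)   # missed foreground
    fp = np.logical_and(gt == 0, pred == 1)   # spurious foreground
    positive = fn.sum() >= fp.sum()           # correct the larger error first
    region = fn if positive else fp
    guidance = np.zeros(gt.shape, dtype=np.float32)
    if region.any():
        ys, xs = np.nonzero(region)
        k = np.random.randint(len(ys))        # random click inside the region
        yy, xx = np.mgrid[:gt.shape[0], :gt.shape[1]]
        guidance = np.exp(-((yy - ys[k]) ** 2 + (xx - xs[k]) ** 2)
                          / (2.0 * sigma ** 2)).astype(np.float32)
    return guidance, positive
```

During training, 15 such interactions would be simulated per batch before a gradient step, with the accumulated positive/negative maps fed in as the two extra input channels.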

The input image is a 512 x 512 2D slice, and the positive and negative click maps are handled automatically internally. The input size for training can be varied by modifying the corresponding input argument in the training configuration.


The training data is from the MICCAI 2015 Challenge: Multi-Atlas Labeling Beyond the Cranial Vault [2]. Link to the challenge data: https://www.synapse.org/#!Synapse:syn3193805/wiki/217789. The dataset has 13 labeled organs, and all of them were utilized for training this model. Further details about the data can be found at the aforementioned link. Medical Segmentation Decathlon (MSD) data was also used for testing on the Spleen and Liver tasks.

  • Target: Annotating Region of Interest
  • Task: Annotation
  • Modality: CT
  • Size: 30 3D Volumes (After preparing into 2D Slices, Train and Validation split was 80/20). Additional data from MSD was used for testing for Spleen (10 3D Volumes) and Liver Tasks (10 3D Volumes).

The data must be converted to 1 mm x 1 mm resolution before training.

Run sh --help from the MMAR/commands folder to see more options for preparing the dataset for training.


Testing was performed on the Medical Segmentation Decathlon dataset for the Spleen and Liver tasks. For a random selection of 10 volumes, the following Dice scores were achieved: ~0.96 on Spleen and ~0.92 on Liver, with 10-15 clicks per 2D slice.
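For reference, the Dice metric reported above can be computed for a pair of binary masks as follows (a standard definition, not code from the SDK):

```python
import numpy as np

def dice_score(pred, gt, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|).

    eps guards against division by zero when both masks are empty.
    """
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)
```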


A graph showing the validation mean Dice over training steps/epochs for Spleen.

Achieves Dice scores of ~0.96 and ~0.95 for training and validation on an 80/20 split.


How to Use this Model

The model was validated with NVIDIA hardware and software. For hardware, the model can run on any NVIDIA GPU with memory greater than 16 GB. For software, this model is usable only as part of Transfer Learning & Annotation Tools in Clara Train SDK container. Find out more about Clara Train at the Clara Train Collections on NGC.

Full instructions for the training and validation workflow can be found in our documentation.


Input: 3-channel 2D CT image/slice with normalized intensity in HU and fixed spacing (1 x 1 x 1 mm).

The first channel is the image/slice; the other two channels are positive and negative guidance maps based on user interactions/clicks.


  1. Resampling to 1 x 1 mm spacing
  2. Normalizing with subtrahend at 208.0 and divisor at 308.0 HU
  3. Converting to channel first
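Steps 2 and 3, together with stacking in the guidance maps, can be sketched as below. This is an illustrative helper under the stated normalization constants; the resampling step (step 1) is assumed to have happened already, since it requires the image header (e.g. via SimpleITK or nibabel):

```python
import numpy as np

def preprocess_slice(ct_slice, pos_guidance, neg_guidance,
                     subtrahend=208.0, divisor=308.0):
    """Normalize a 2D CT slice (in HU) and stack it with the click maps
    into the 3-channel, channel-first input described above.

    Hypothetical helper; the SDK's transform chain applies the same
    subtrahend/divisor normalization internally.
    """
    img = (ct_slice.astype(np.float32) - subtrahend) / divisor
    # Channel-first (C, H, W): image, positive map, negative map.
    return np.stack([img, pos_guidance, neg_guidance], axis=0)
```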


Output: 1 channel

  • Label 0: everything else
  • Label 1: object/region of interest
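The single-channel output can be turned into the two labels above by thresholding. A minimal sketch, assuming the channel holds raw logits that are passed through a sigmoid (the exact activation in the SDK's inference pipeline may differ):

```python
import numpy as np

def postprocess_logits(logits, threshold=0.5):
    """Convert the single-channel network output into a binary mask.

    Applies a sigmoid to the raw logits, then thresholds:
    label 1 = region of interest, label 0 = everything else.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))   # sigmoid
    return (probs >= threshold).astype(np.uint8)
```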


This training and inference pipeline was developed by NVIDIA. This research-use-only software has not been cleared or approved by the FDA or any regulatory agency. Clara pre-trained models are for developmental purposes only and cannot be used directly for clinical procedures.


[1] Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In International conference on medical image computing and computer-assisted intervention 2016 Oct 17 (pp. 424-432). Springer, Cham.

[2] Landman, B., et al. "Multi-atlas labeling beyond the cranial vault." URL: (2015).

[3] Sakinis, Tomas, et al. "Interactive segmentation of medical images through fully convolutional neural networks." arXiv preprint arXiv:1903.08205 (2019).


End User License Agreement is included with the product. Licenses are also available along with the model application zip file. By pulling and using the Clara Train SDK container and downloading models, you accept the terms and conditions of these licenses.