To train your model using mixed or TF32 precision with Tensor Cores or using FP32, perform the following steps using the default parameters of the 3D-UNet model on the Brain Tumor Segmentation 2019 dataset. These steps enable you to build the 3D-UNet TensorFlow NGC container, train and evaluate your model, and generate predictions on the test data. For the specifics concerning training and inference, see the Advanced section.
Clone the repository.
Executing this command will create your local repository with all the code to run 3D-UNet.

```bash
git clone https://github.com/NVIDIA/DeepLearningExamples
cd DeepLearningExamples/TensorFlow/Segmentation/U-Net3D_TF
```
Build the U-Net TensorFlow NGC container.
This command will use the Dockerfile to create a Docker image named `unet3d_tf`, downloading all the required components automatically.

```bash
docker build -t unet3d_tf .
```
The NGC container contains all the components optimized for usage on NVIDIA hardware.
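As an optional sanity check (not part of the original guide), you can confirm that the image was built and tagged as expected:

```bash
# Optional: list the freshly built image; "unet3d_tf" filters the
# output to the repository name used in the build command above.
docker images unet3d_tf
```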
Start an interactive session in the NGC container to run preprocessing/training/inference.
The following command will launch the container and mount the `./data` directory as a volume to the `/data` directory inside the container, and the `./results` directory to the `/results` directory in the container.

```bash
mkdir data
mkdir results
docker run --runtime=nvidia -it --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --rm --ipc=host -v ${PWD}/data:/data -v ${PWD}/results:/results unet3d_tf:latest /bin/bash
```
Any datasets and experiment results (logs, checkpoints, etc.) saved to `/data` or `/results` will be accessible in the `./data` or `./results` directory on the host, respectively.
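If you want to convince yourself that the bind mounts work as described, a minimal round-trip check (the marker file name is just an example) looks like this:

```bash
# Inside the container: write a marker file to the mounted results volume.
touch /results/mount_check.txt

# On the host, in a separate shell from the repository root:
# the same file should appear under ./results.
ls ./results
```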
Download and pre-process the data.
Data can be obtained by registering on the Brain Tumor Segmentation 2019 dataset website. The data should be downloaded and placed where `/data` in the container is mounted. The `dataset/preprocess_data.py` script will convert the raw data into the TFRecord format used for training and evaluation. The script can be launched as:

```bash
python dataset/preprocess_data.py -i /data/<name/of/the/raw/data/folder> -o /data/<name/of/the/preprocessed/data/folder> -v
```
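After preprocessing finishes, it is worth verifying that the output directory was populated; the listing below is only illustrative, since the exact file names depend on the preprocessing script:

```bash
# Optional: inspect the generated TFRecord files
# (the names shown will depend on the script's output).
ls -lh /data/<name/of/the/preprocessed/data/folder>
```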
Start training.
After the Docker container is launched, the training of a single fold (fold 0) with the default hyperparameters (for example, on 1 or 8 GPUs, with TF-AMP, FP32, or TF32 precision) can be started with:

```bash
bash scripts/unet3d_train_single{_TF-AMP}.sh <number/of/gpus> <path/to/dataset> <path/to/checkpoint> <batch/size>
```

For example, to run with 32-bit precision (FP32 or TF32) with batch size 2 on 1 GPU, use:

```bash
bash scripts/unet3d_train_single.sh 1 /data/preprocessed /results 2
```

To train a single fold with mixed precision (TF-AMP) on 8 GPUs with batch size 2 per GPU, use:

```bash
bash scripts/unet3d_train_single_TF-AMP.sh 8 /data/preprocessed /results 2
```
The obtained dice scores will be reported after the training has finished.
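Since the scores are printed to the console, you may want to keep a copy of the output; the `tee` redirection below is an optional way to do that (the log file name is just an example):

```bash
# Optional: save the console output (including the final dice scores) to a log file.
bash scripts/unet3d_train_single.sh 1 /data/preprocessed /results 2 2>&1 | tee /results/train_fold0.log
```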
Start benchmarking.
The training and inference performance can be evaluated by using the benchmarking scripts:

```bash
bash scripts/unet3d_{train,infer}_benchmark{_TF-AMP}.sh <number/of/gpus/for/training> <path/to/dataset> <path/to/checkpoint> <batch/size>
```

These scripts will run the model and report its performance. For example, to benchmark training with TF-AMP with batch size 2 on 4 GPUs, use:

```bash
bash scripts/unet3d_train_benchmark_TF-AMP.sh 4 /data/preprocessed /results 2
```

To obtain inference performance with 32-bit precision (FP32 or TF32) with batch size 1, use:

```bash
bash scripts/unet3d_infer_benchmark.sh /data/preprocessed /results 1
```
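If you want to compare throughput across batch sizes, a simple wrapper loop (the batch sizes and GPU count below are arbitrary examples) can be used:

```bash
# Hypothetical sweep: benchmark TF-AMP training on 8 GPUs across several batch sizes.
for bs in 1 2 4; do
    bash scripts/unet3d_train_benchmark_TF-AMP.sh 8 /data/preprocessed /results $bs
done
```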