Linux / amd64
More info: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/collections/claradeploy
This asset requires the Clara Deploy SDK. Follow the instructions on the Clara Ansible page to install the Clara Deploy SDK.
This example is a containerized AI inference application, developed for use as one of the operators in the Clara Deploy pipelines. The application is built on the base AI application container, which provides the application framework to deploy Clara Train TLT trained models. The same execution configuration file, set of transform functions, and scanning window inference logic are used; however, inference is performed on the Triton Inference Server.
This application, in the form of a Docker container, expects an input folder (/input by default), which can be mapped to a host volume when the Docker container is started. This folder must contain a volume image file in NIfTI or MetaImage format. Furthermore, the volume image must be constructed from a single series of a DICOM study, typically an axial series with the data type of the original primary.
This application saves the segmentation results to an output folder (/output by default), which can also be mapped to a folder on the host volume. After the application completes successfully, a segmentation volume image in MetaImage format is saved in the output folder. The name of the output file is the same as that of the input file, due to certain limitations of the downstream consumer.
The example container also publishes data for the Clara Deploy Render Server to the /publish folder by default. The original volume image, segmented volume image, and metadata file, along with a render configuration file, are saved in this folder.
The application uses the segmentation_ct_pancreas_and_tumor model, which uses the tensorflow_graphdef platform. The input tensor is of shape 96x96x96 with a single channel. The output is of the same shape with three channels.
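Once the Triton Inference Server started by the run script later in this document is up, you can confirm that the model has been loaded by querying the same status endpoint the script uses for its readiness check (a sketch; localhost:8000 assumes the default port mapping used in the script):
# Query the TRTIS v1 status endpoint and look for the pancreas model entry
curl -s localhost:8000/api/status | grep -A 3 segmentation_ct_pancreas_and_tumor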
The NVIDIA® Clara Train Transfer Learning Toolkit (TLT) for Medical Imaging provides pre-trained models unique to medical imaging, with additional capabilities such as integration with the AI-Assisted Annotation SDK to speed up the annotation of medical images, allowing access to AI-assisted labeling.
The application uses the segmentation_ct_pancreas_and_tumor model provided by the NVIDIA Clara Train TLT for pancreas tumor segmentation, which is converted from the TensorFlow checkpoint model to tensorflow_graphdef using the TLT model export tool.
You can download the model using the following commands:
# Download NGC Catalog CLI
wget https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip && unzip ngccli_cat_linux.zip && rm ngccli_cat_linux.zip ngc.md5 && chmod u+x ngc
# Configure API key (Refer to https://docs.nvidia.com/ngc/ngc-getting-started-guide/index.html#generating-api-key)
./ngc config set
# Download the model
./ngc registry model download-version nvidia/med/segmentation_ct_pancreas_and_tumor:1
Note: The NGC Catalog CLI is needed to download models when the Clara Train SDK is not used. Please follow the NGC documentation to configure the CLI API key.
Detailed model information can be found in (downloaded model folder)/docs/Readme.md.
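After the download completes, it can be useful to confirm that the model files follow the layout Triton expects for a tensorflow_graphdef model, i.e. a numbered version folder containing the graph file plus a config.pbtxt (a sketch; the exact contents of the downloaded folder may differ between versions):
# List the downloaded model folder; Triton conventionally expects <model-name>/<version>/model.graphdef
ls -R segmentation_ct_pancreas_and_tumor_v1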
This application also uses the same transforms library and configuration file as the validation/inference pipeline used during TLT model training. The key model attributes (e.g. the model name and network input dimensions) are saved in the config_inference.json file and consumed by the application at runtime.
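To review these settings before a run, or before building a customized container for a different model, the configuration file can be pretty-printed (a sketch; the path assumes the config directory from the source tree shown later in this document):
# Pretty-print the runtime configuration consumed by the application
python3 -m json.tool config/config_inference.json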
The application performs inference on the NVIDIA Triton Inference Server, which provides a cloud inferencing solution optimized for NVIDIA GPUs. The server provides an inference service via an HTTP or gRPC endpoint, allowing remote clients to request inferencing for any model being managed by the server.
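Before submitting inference requests, the server can be probed with simple HTTP checks (a sketch based on the TRTIS v1 HTTP API used by this release; localhost:8000 assumes the default port mapping):
# Liveness and readiness checks against the inference server's HTTP endpoint
curl -s -o /dev/null -w "live: %{http_code}\n" localhost:8000/api/health/live
curl -s -o /dev/null -w "ready: %{http_code}\n" localhost:8000/api/health/ready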
The application source code files are in the directory structure shown below.
/app_base_inference_v2
├── ai4med
├── config
│ ├── config_render.json
│ ├── config_inference.json
│ └── __init__.py
├── dlmed
├── inferers
├── model_loaders
├── ngc
├── public
├── utils
├── writers
├── app.py
├── Dockerfile
├── executor.py
├── logging_config.json
├── main.py
└── requirements.txt
The following describes the directory contents:
- The ai4med and dlmed directories contain the library modules shared with the Clara Train SDK, mainly the transforms functions and the base inference client classes.
- The config directory contains the model-specific configuration files, which are needed when building a customized container for a specific model. The config_inference.json file contains the configuration sections for pre- and post-transforms, as well as the model loader, inferer, and writer. The config_render.json file contains the configuration for the Clara Deploy Render Server.
- The inferers directory contains the implementation of the simple and scanning window inference clients using the Triton API client library.
- The model_loaders directory contains the implementation of the model loader that gets model details from the Triton Inference Server.
- The ngc and public directories contain the user documentation.
- The utils directory contains utilities for loading modules and creating application objects.
- The writers directory contains the specialized output writer required by the Clara Deploy SDK, which saves the segmentation result to a volume image file in MetaImage format.
Before running the application locally, ensure that the Triton Inference Server Docker image has been imported into the local Docker repository. Check with the following command:
docker images | grep tritonserver
Verify that the image name is tritonserver and that the tag is correct for the release, e.g. 20.07-v1-py3. If the image does not exist locally, it will be pulled from the NVIDIA Docker registry.
Also, download the model from the MODEL SCRIPTS section for this container on NGC, following the steps in the Setup section.
Change to your working directory (e.g. test_pancreas_tumor).
Create, if they do not exist, the following directories under your working directory (see the staging sketch after these steps):
- input, containing the input image file
- output, for the segmentation output
- publish, for publishing data for the Render Server
- logs, for the log files
- models, containing models copied from the segmentation_ct_pancreas_and_tumor_v1 folder
In your working directory, create a shell script, e.g. run_docker.sh, or another name if you prefer, copy the sample content below into it, and save it. Update the variable APP_NAME to the full name of this Docker image, e.g. nvcr.io/ea-nvidia-clara/clara/ai-pancreastumor:0.7.2-2009.3.
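As a convenience, the directories, the model files, and a test input volume from the steps above can be staged with commands like the following (a sketch; the input file names are hypothetical and the path of the downloaded model folder depends on where you ran the NGC download):
# Create the working directories expected by the run script
mkdir -p input output publish logs models
# Copy the downloaded model folder into the models directory
cp -r /path/to/segmentation_ct_pancreas_and_tumor_v1 models/
# Stage a single-series CT volume in MetaImage format (header plus raw data) as the input
cp /data/ct_abdomen.mhd /data/ct_abdomen.raw input/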
.#!/bin/bash
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
# Define the name of the app (aka operator); assumed the same as the project folder name
APP_NAME="app_pancreastumor"
# Define the model name for use when launching TRTIS with only the specific model
MODEL_NAME="segmentation_ct_pancreas_and_tumor_v1"
# Define the Triton Inference Server Docker image, which will be used for testing
# Use either local repo or NVIDIA repo
TRITON_IMAGE="nvcr.io/nvidia/tritonserver:20.07-v1-py3"
# Launch the container with the following environment variables
# to provide runtime information
export NVIDIA_CLARA_TRTISURI="localhost:8000"
# Create a Docker network so that containers can communicate on this network
NETWORK_NAME="container-demo"
# Create network
docker network create ${NETWORK_NAME}
# Run TRTIS (name: triton), mapping ./models/${MODEL_NAME} to /models/${MODEL_NAME}
# (localhost:8000 will be used)
RUN_TRITON="nvidia-docker run --name triton --network ${NETWORK_NAME} -d --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
-p 8000:8000 \
-v $(pwd)/models/${MODEL_NAME}:/models/${MODEL_NAME} ${TRITON_IMAGE} \
tritonserver --model-repository=/models"
# Display the command
echo ${RUN_TRITON}
# Run the command to start the inference server Docker
eval ${RUN_TRITON}
# Wait until Triton is ready
triton_local_uri=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' triton)
echo -n "Wait until Triton ${triton_local_uri} is ready..."
while [ $(curl -s ${triton_local_uri}:8000/api/status | grep -c SERVER_READY) -eq 0 ]; do
sleep 1
echo -n "."
done
echo "done"
export NVIDIA_CLARA_TRTISURI="${triton_local_uri}:8000"
# Run ${APP_NAME} container
# Launch the app container with the following environment variables internally,
# to provide input/output path information
docker run --name ${APP_NAME} --network ${NETWORK_NAME} -it --rm \
-v $(pwd)/input:/input \
-v $(pwd)/output:/output \
-v $(pwd)/logs:/logs \
-v $(pwd)/publish:/publish \
-e NVIDIA_CLARA_TRTISURI \
-e DEBUG_VSCODE \
-e DEBUG_VSCODE_PORT \
-e NVIDIA_CLARA_NOSYNCLOCK=TRUE \
${APP_NAME}
echo "${APP_NAME} is done."
# Stop Triton container
echo "Stopping Triton"
docker stop triton > /dev/null
# Remove network
docker network remove ${NETWORK_NAME} > /dev/null
Execute the script as shown below and wait for the application container to finish:
./run_docker.sh
Check for the following output files:
- In the output directory: the segmentation volume image in MetaImage format (a .mhd header file and a .raw data file, named after the input file).
- In the publish directory: the original volume image, the segmented volume image (.output.mhd and .output.raw files), the render configuration file (config_render.json), and the metadata file (config.meta).
To visualize the segmentation results, any tool that supports MetaImage (MHD) or NIfTI can be used, e.g. 3D Slicer.
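Because MetaImage stores its metadata in a plain-text header, a quick way to confirm the dimensions and spacing of the result is to print the header (the file name matches that of the input file):
# Print the MetaImage header of the segmentation result
head -n 20 output/*.mhd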
To see the internals of the container, or to run the application manually within the container, follow these steps:
- In the run script, replace the docker run command for the application container with the following, keeping the same volume mappings, environment variables, and image name, so that the container starts an interactive shell instead of the application (see the sketch after these steps): docker run -it --rm --entrypoint /bin/bash
- Once inside the container, ensure that the current directory is /.
- Run the application with the command: python3 ./app_base_inference_v2/main.py
- When finished, type exit.
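Put together with the mounts and environment variables from the run script above, the interactive command could look like the following (a sketch, assuming the same working directory layout and the APP_NAME and NETWORK_NAME variables from run_docker.sh):
# Start the application container with a shell instead of the default entrypoint
docker run --name ${APP_NAME} --network ${NETWORK_NAME} -it --rm \
    -v $(pwd)/input:/input \
    -v $(pwd)/output:/output \
    -v $(pwd)/logs:/logs \
    -v $(pwd)/publish:/publish \
    -e NVIDIA_CLARA_TRTISURI \
    --entrypoint /bin/bash \
    ${APP_NAME}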
An End User License Agreement is included with the product. By pulling and using the Clara Deploy asset on NGC, you accept the terms and conditions of these licenses.
Release Notes, the Getting Started Guide, and the SDK itself are available at the NVIDIA Developer forum.
For answers to any questions you may have about this release, visit the NVIDIA Devtalk forum.