
Endoscopy Sample App Data


Description

Clara Holoscan Sample App Data for AI-based Endoscopy Tool Tracking

Publisher

NVIDIA

Use Case

Other

Framework

Other

Latest Version

20220608

Modified

June 17, 2022

Compressed Size

909.36 MB

Overview

This resource contains the convolutional LSTM model for tool tracking in laparoscopic videos by Nwoye et al. [1] and sample surgical video data, kindly provided by Research Group Camma, IHU Strasbourg and University of Strasbourg.

[1] Nwoye, C.I., Mutter, D., Marescaux, J. and Padoy, N., 2019. Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos. International Journal of Computer Assisted Radiology and Surgery, 14(6), pp.1059-1067.

Refer to the Clara Holoscan Embedded SDK GitHub repository for details on how the sample app data is utilized.

Model

The AI model for instrument tracking in endoscopy is a convolutional LSTM model that, given an 854 x 480 RGB image, provides:

  • per-instrument detection probability,
  • per-instrument detected tooltip location,
  • semantic segmentation of the instruments, with 7 values per pixel (one per instrument class).

For details on the model inputs and outputs see below.

Inputs

  • data_ph:0 - Input RGB image (batch size, height, width, channels)
    • shape=[1, 480, 854, 3]
    • dtype=float32
    • range=[0, 255]
  • cellstate_ph:0 - LSTM cell state tensor
    • shape=[1, 60, 107, 7]
    • dtype=float32
  • hiddenstate_ph:0 - LSTM hidden state tensor
    • shape=[1, 60, 107, 7]
    • dtype=float32
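
The input layout above can be sketched with NumPy as follows. This is only an illustration: the helper `make_inputs` is hypothetical, and starting a sequence from all-zero LSTM states is an assumption that this page does not confirm.

```python
import numpy as np

# Shapes taken from the input list above.
IMAGE_SHAPE = (1, 480, 854, 3)  # (batch size, height, width, channels)
STATE_SHAPE = (1, 60, 107, 7)   # shared by cellstate_ph:0 and hiddenstate_ph:0


def make_inputs(frame_rgb, cell_state=None, hidden_state=None):
    """Pack one RGB frame and the two LSTM states into a dict keyed by tensor name."""
    frame = np.asarray(frame_rgb)
    if frame.shape != IMAGE_SHAPE[1:]:
        raise ValueError(f"expected frame of shape {IMAGE_SHAPE[1:]}, got {frame.shape}")
    # Assumption: zero-filled states for the first frame of a sequence.
    if cell_state is None:
        cell_state = np.zeros(STATE_SHAPE, dtype=np.float32)
    if hidden_state is None:
        hidden_state = np.zeros(STATE_SHAPE, dtype=np.float32)
    return {
        # Cast to float32 and add the batch axis; pixel values stay in [0, 255].
        "data_ph:0": frame.astype(np.float32)[np.newaxis, ...],
        "cellstate_ph:0": cell_state,
        "hiddenstate_ph:0": hidden_state,
    }


inputs = make_inputs(np.zeros((480, 854, 3), dtype=np.uint8))
for name, tensor in inputs.items():
    print(name, tensor.shape, tensor.dtype)
```

Note that the image is stored height-first (480 rows, 854 columns), although the model description quotes the resolution as 854 x 480.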

Outputs

  • Model/net_states:0 - LSTM output cell state tensor. The output for frame N is the input for frame N + 1.
    • shape=[1, 60, 107, 7]
    • dtype=float32
  • Model/net_states:1 - LSTM output hidden state tensor. The output for frame N is the input for frame N + 1.
    • shape=[1, 60, 107, 7]
    • dtype=float32
  • probs:0 - Per-instrument detection probability
    • shape=[1, 7]
    • dtype=float32
    • range=[0,1]
  • Localize/scaled_coords:0 - Per-instrument (x,y) detected tooltip location
    • shape=[1, 2, 7]
    • dtype=float32
  • Localize_1/binary_masks:0 - Image with per-instrument segmentation masks. Each pixel stores 7 values, one per instrument class
    • shape=[1, 60, 107, 7]
    • dtype=float32
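
The frame-to-frame state feedback and a simple reading of the outputs can be sketched as below. Several things here are assumptions rather than documented behavior: `fake_infer` is a hypothetical stand-in for the real model that returns fixed values with the listed shapes, the first and second state outputs are assumed to feed `cellstate_ph:0` and `hiddenstate_ph:0` respectively, the 0.5 detection threshold is arbitrary, and axis 1 of the coordinate tensor is assumed to be ordered (x, y).

```python
import numpy as np

NUM_CLASSES = 7
STATE_SHAPE = (1, 60, 107, NUM_CLASSES)


def fake_infer(image, cell_state, hidden_state):
    """Hypothetical stand-in for the real model: fixed outputs with the listed shapes."""
    probs = np.array([[0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6]], dtype=np.float32)
    coords = np.zeros((1, 2, NUM_CLASSES), dtype=np.float32)
    coords[0, 0, :] = np.arange(NUM_CLASSES) * 100.0  # assumed x per instrument
    coords[0, 1, :] = np.arange(NUM_CLASSES) * 50.0   # assumed y per instrument
    masks = np.zeros(STATE_SHAPE, dtype=np.float32)
    # Incrementing the states makes the frame-to-frame feedback visible.
    return cell_state + 1.0, hidden_state + 1.0, probs, coords, masks


def detected_tooltips(probs, coords, threshold=0.5):
    """Return {class_index: (x, y)} for instruments whose probability exceeds threshold."""
    tips = {}
    for k in range(probs.shape[1]):
        if probs[0, k] > threshold:
            tips[k] = (float(coords[0, 0, k]), float(coords[0, 1, k]))
    return tips


# Assumption: zero states before the first frame.
cell = np.zeros(STATE_SHAPE, dtype=np.float32)
hidden = np.zeros(STATE_SHAPE, dtype=np.float32)

for frame_index in range(3):
    image = np.zeros((1, 480, 854, 3), dtype=np.float32)  # placeholder frame
    # The state outputs of frame N become the state inputs of frame N + 1.
    cell, hidden, probs, coords, masks = fake_infer(image, cell, hidden)
    tips = detected_tooltips(probs, coords)

# One 60 x 107 mask per instrument class lives in the last axis.
mask_for_class_2 = masks[0, :, :, 2]
```

With a real inference backend, only `fake_infer` would be replaced; the loop that carries the two state tensors forward stays the same.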

Video Data

Files:

  • video/raw.mp4: Surgical video sample sequence provided by Research Group Camma, IHU Strasbourg & University of Strasbourg. The video includes all 7 instrument classes supported by the tool tracking model.
  • video/surgical_video.gxf_*: raw.mp4 converted for use with the GXF replayer extension.

Directory Structure

The package contains two folders:

  • the model folder with the source model in ONNX format, and .engine files for the TensorRT models optimized for the Clara AGX and Holoscan developer kits.
  • the video folder with the recorded video data, both in mp4 (source) and converted to the NVIDIA GXF tensor format.
/
├── CAMMA_LSTM_Model.png
├── CAMMA_NVIDIA_License.pdf
├── NVIDIA-Clara-Holoscan-SDK-EULA.pdf
├── model
│   ├── tool_loc_convlstm.onnx
│   └── tool_loc_convlstm_engines
│       ├── NVIDIA-RTX-A6000_c86_n84.engine
│       └── Quadro-RTX-6000_c75_n72.engine
└── video
    ├── raw.mp4
    ├── surgical_video.gxf_entities
    └── surgical_video.gxf_index
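
After unpacking, the layout above can be checked with a short script such as the following sketch. The helpers `missing_files` and `parse_engine_name` are hypothetical. The `_c<digits>_n<digits>` suffix of the engine file names is parsed as two plain integers; it may encode the GPU compute capability and multiprocessor count, but that reading is a guess from the file names and is not stated on this page.

```python
import re
from pathlib import Path

# Files listed in the directory tree above, relative to the package root.
EXPECTED_FILES = [
    "model/tool_loc_convlstm.onnx",
    "video/raw.mp4",
    "video/surgical_video.gxf_entities",
    "video/surgical_video.gxf_index",
]

# Matches names such as "NVIDIA-RTX-A6000_c86_n84.engine".
ENGINE_NAME = re.compile(r"^(?P<gpu>.+)_c(?P<c>\d+)_n(?P<n>\d+)\.engine$")


def missing_files(root):
    """Return the expected files that are not present under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


def parse_engine_name(filename):
    """Split an engine file name into GPU name and the two numeric suffixes."""
    match = ENGINE_NAME.match(filename)
    if match is None:
        return None
    # Assumption: "c" and "n" are kept as opaque integers; their meaning is not documented here.
    return {"gpu": match["gpu"], "c": int(match["c"]), "n": int(match["n"])}


print(parse_engine_name("NVIDIA-RTX-A6000_c86_n84.engine"))
print(parse_engine_name("Quadro-RTX-6000_c75_n72.engine"))
```

Calling `missing_files` with the directory where the package was extracted returns an empty list when all four files are in place.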

License

Refer to CAMMA_NVIDIA_License.pdf supplied within, and to the NVIDIA-Clara-Holoscan-SDK-EULA.pdf license agreement for use of the sample data.