Endoscopy Sample App Data

NVIDIA

Holoscan Sample App Data for AI-based Endoscopy Tool Tracking

This resource contains the convolutional LSTM model for tool tracking in laparoscopic videos by Nwoye et al. [1] and a sample surgical video.

[1] Nwoye, C.I., Mutter, D., Marescaux, J. and Padoy, N., 2019. Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos. International Journal of Computer Assisted Radiology and Surgery, 14(6), pp. 1059-1067.

Model

The AI model for instrument tracking in endoscopy is a convolutional LSTM model that, given an 854 x 480 RGB image, provides:

per-instrument detection probability,
per-instrument detected tooltip location,
semantic segmentation of the instruments. Each pixel stores 7 labels, one per instrument class.

Note: The provided model is in ONNX format. It will automatically be converted into a TensorRT model (.engine) the first time it is processed by a Holoscan application.

Inputs

data_ph:0 - Input RGB image (batchsize, height, width, channels)
- shape=[1, 480, 854, 3]
- dtype=float32
- range=[0, 255]
cellstate_ph:0 - LSTM hidden state tensor 0
- shape=[1, 60, 107, 7]
- dtype=float32
hiddenstate_ph:0 - LSTM hidden state tensor 1
- shape=[1, 60, 107, 7]
- dtype=float32
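
As a minimal sketch of the input layout listed above, the following NumPy snippet builds the three input arrays with the documented shapes and dtypes. The helper name make_inputs is hypothetical, and zero-filling the two state tensors for the first frame is an assumption made for illustration; neither is stated in the model description.

```python
import numpy as np

# Shapes as documented in the Inputs list.
IMAGE_SHAPE = (1, 480, 854, 3)   # (batchsize, height, width, channels)
STATE_SHAPE = (1, 60, 107, 7)


def make_inputs(frame_rgb_u8, cell_state=None, hidden_state=None):
    """Build the input dictionary from one 480 x 854 RGB uint8 frame.

    Hypothetical helper. Zero-filled initial states are an assumption.
    """
    if frame_rgb_u8.shape != IMAGE_SHAPE[1:]:
        raise ValueError(f"expected frame shape {IMAGE_SHAPE[1:]}, got {frame_rgb_u8.shape}")
    # Cast to float32 without rescaling, so values stay in the [0, 255] range,
    # then add the leading batch dimension.
    data = frame_rgb_u8.astype(np.float32)[np.newaxis, ...]
    if cell_state is None:
        cell_state = np.zeros(STATE_SHAPE, dtype=np.float32)
    if hidden_state is None:
        hidden_state = np.zeros(STATE_SHAPE, dtype=np.float32)
    return {
        "data_ph:0": data,
        "cellstate_ph:0": cell_state,
        "hiddenstate_ph:0": hidden_state,
    }
```

For frames after the first, the state arguments would be filled from the previous frame's state outputs rather than left at their defaults.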

Outputs

Model/net_states:0 - LSTM output state tensor. Output of frame N is input for frame N + 1.
- shape=[1, 60, 107, 7]
- dtype=float32
Model/net_hidden:0 - LSTM hidden state tensor. Output of frame N is input for frame N + 1.
- shape=[1, 60, 107, 7]
- dtype=float32
probs:0 - Per-instrument detection probability
- shape=[1, 7]
- dtype=float32
- range=[0, 1]
Localize/scaled_coords:0 - Per-instrument (x, y) detected tooltip location
- shape=[1, 2, 7]
- dtype=float32
Localize_1/binary_masks:0 - Image with per-instrument segmentation masks. Each pixel stores 7 labels, one per instrument class.
- shape=[1, 60, 107, 7]
- dtype=float32
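
The recurrence described above (state outputs of frame N become state inputs of frame N + 1) can be sketched as follows. This is an illustration under stated assumptions, not the Holoscan implementation: fake_model is a hypothetical stand-in that only returns arrays with the documented shapes, the outputs are taken positionally in the order listed above, the 0.5 detection threshold and the zero initial states are arbitrary choices, and reading axis 1 of the coordinate tensor as (x, y) is an interpretation of the documented shape [1, 2, 7].

```python
import numpy as np

STATE_SHAPE = (1, 60, 107, 7)
NUM_INSTRUMENTS = 7


def fake_model(data, cell_state, hidden_state):
    """Hypothetical stand-in for the network; returns documented output shapes."""
    new_cell = cell_state + 1.0      # dummy update so the carry-over is visible
    new_hidden = hidden_state + 1.0
    probs = np.array([[0.9, 0.1, 0.2, 0.7, 0.0, 0.3, 0.4]], dtype=np.float32)
    coords = np.zeros((1, 2, NUM_INSTRUMENTS), dtype=np.float32)
    masks = np.zeros(STATE_SHAPE, dtype=np.float32)
    return new_cell, new_hidden, probs, coords, masks


def track(frames, model, threshold=0.5):
    """Run the model frame by frame, feeding each frame's states into the next."""
    cell = np.zeros(STATE_SHAPE, dtype=np.float32)    # assumed initial state
    hidden = np.zeros(STATE_SHAPE, dtype=np.float32)  # assumed initial state
    results = []
    for frame in frames:
        cell, hidden, probs, coords, _masks = model(frame, cell, hidden)
        # Keep only instruments whose detection probability passes the threshold.
        detected = np.flatnonzero(probs[0] >= threshold)
        tooltips = {
            int(i): (float(coords[0, 0, i]), float(coords[0, 1, i]))
            for i in detected
        }
        results.append(tooltips)
    return results, cell, hidden
```

Each entry of the returned list maps an instrument index to its tooltip coordinates for one frame; the final states are returned so that processing can resume on a later batch of frames.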

Video Data

The sample data, kindly provided by Research Group Camma, IHU Strasbourg and University of Strasbourg, is a surgical video that includes all 7 instrument classes supported by the tool tracking model. It is in raw H.264 format.

Note: The .h264 file must be converted into a GXF tensor file using the convert_video_to_gxf_entities.py script on GitHub before it can be used with the VideoStreamReplayer Holoscan operator.
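
To make the difference between the compressed .h264 file and per-frame tensors concrete, the sketch below shows the size arithmetic for decoded frames at the model's 854 x 480 resolution. It assumes the video has already been decoded to packed 8-bit RGB by an external decoder; the helper split_raw_rgb is hypothetical and does not reproduce the interface or output format of convert_video_to_gxf_entities.py.

```python
import numpy as np

WIDTH, HEIGHT, CHANNELS = 854, 480, 3
FRAME_BYTES = WIDTH * HEIGHT * CHANNELS  # 1,229,760 bytes per packed RGB frame


def split_raw_rgb(buffer):
    """Split packed 8-bit RGB frames into an array of shape (N, 480, 854, 3).

    Hypothetical helper; assumes the input is already decoded, not H.264.
    """
    if len(buffer) % FRAME_BYTES != 0:
        raise ValueError("buffer does not contain a whole number of frames")
    frames = np.frombuffer(buffer, dtype=np.uint8)
    return frames.reshape(-1, HEIGHT, WIDTH, CHANNELS)
```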

License

Refer to the license agreement for use of the sample data.

Publisher

NVIDIA

Latest Version: 20230222

Updated: April 19, 2023 UTC

Compressed Size: 47.27 MB

Labels

AI, Clara, DL, Healthcare, Holoscan