Holoscan Sample App Data for AI-based Endoscopy Tool Tracking

This resource contains the convolutional LSTM model for tool tracking in laparoscopic videos by Nwoye et al. [1] and a sample surgical video.

[1] Nwoye, C.I., Mutter, D., Marescaux, J. and Padoy, N., 2019. Weakly supervised convolutional LSTM approach for tool tracking in laparoscopic videos. International Journal of Computer Assisted Radiology and Surgery, 14(6), pp. 1059-1067.

Model

The AI model for instrument tracking in endoscopy is a convolutional LSTM model that, given an 854 x 480 RGB image, provides:

  • per-instrument detection probability,
  • per-instrument detected tooltip location,
  • semantic segmentation of the instruments. Each pixel stores 7 labels.

Note: The provided model is in ONNX format. It will automatically be converted into a TensorRT model (.engine) the first time it is processed by a Holoscan application.

Inputs

  • data_ph:0 - Input RGB image (batch size, height, width, channels)
    • shape=[1, 480, 854, 3]
    • dtype=float32
    • range=[0, 255]
  • cellstate_ph:0 - LSTM hidden state tensor 0
    • shape=[1, 60, 107, 7]
    • dtype=float32
  • hiddenstate_ph:0 - LSTM hidden state tensor 1
    • shape=[1, 60, 107, 7]
    • dtype=float32
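
As a rough illustration of the input list above, the following sketch records the three input tensors and works out their element counts and float32 buffer sizes. The dictionary keys ("image", "state_0", "state_1") and the helper names TensorSpec and frame_to_float32 are hypothetical and not part of the Holoscan SDK; only the tensor names, shapes, dtype, and value range come from this page. It assumes the frame arrives as packed 8-bit RGB in height, width, channel order.

```python
from array import array
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class TensorSpec:
    """Hypothetical helper describing one model input."""
    name: str
    shape: tuple
    itemsize: int = 4  # bytes per float32 element

    @property
    def num_elements(self) -> int:
        return prod(self.shape)

    @property
    def num_bytes(self) -> int:
        return self.num_elements * self.itemsize


# Tensor names and shapes copied from the input list above.
# The dictionary keys are illustrative labels only.
INPUTS = {
    "image": TensorSpec("data_ph:0", (1, 480, 854, 3)),
    "state_0": TensorSpec("cellstate_ph:0", (1, 60, 107, 7)),
    "state_1": TensorSpec("hiddenstate_ph:0", (1, 60, 107, 7)),
}


def frame_to_float32(rgb_bytes: bytes) -> array:
    """Convert one packed 8-bit RGB frame to a flat float32 buffer.

    Values stay in [0, 255]; no normalization is applied, matching the
    range listed for data_ph:0.
    """
    spec = INPUTS["image"]
    if len(rgb_bytes) != spec.num_elements:
        raise ValueError(
            f"expected {spec.num_elements} bytes, got {len(rgb_bytes)}"
        )
    # Iterate over the values explicitly: array("f", some_bytes) would
    # reinterpret the raw bytes instead of converting each value.
    return array("f", (float(b) for b in rgb_bytes))


if __name__ == "__main__":
    for key, spec in INPUTS.items():
        print(key, spec.name, spec.num_elements, spec.num_bytes)
```

Under these assumptions, one image tensor holds 1,229,760 float32 values (4,919,040 bytes) and each state tensor holds 44,940 values (179,760 bytes).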

Outputs

  • Model/net_states:0 - LSTM output state tensor. Output of frame N is input for frame N + 1.
    • shape=[1, 60, 107, 7]
    • dtype=float32
  • Model/net_hidden:0 - LSTM hidden state tensor. Output of frame N is input for frame N + 1.
    • shape=[1, 60, 107, 7]
    • dtype=float32
  • probs:0 - Per-instrument detection probability
    • shape=[1, 7]
    • dtype=float32
    • range=[0, 1]
  • Localize/scaled_coords:0 - Per-instrument (x, y) detected tooltip location
    • shape=[1, 2, 7]
    • dtype=float32
  • Localize_1/binary_masks:0 - Image with per-instrument segmentation masks. Each pixel stores 7 labels.
    • shape=[1, 60, 107, 7]
    • dtype=float32
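
The frame-to-frame state handling described above can be sketched as follows. This is a minimal illustration, not the Holoscan implementation: the infer callable is a hypothetical stand-in for whatever runs the model, zero-initialized states for the first frame and the 0.5 detection threshold are assumptions, and the coordinate tensor is assumed to store x values at index 0 and y values at index 1 of its second axis.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

STATE_SIZE = 1 * 60 * 107 * 7  # elements in each LSTM state tensor

# (class index, probability, x, y)
Detection = Tuple[int, float, float, float]


def decode(probs, coords, threshold: float = 0.5) -> List[Detection]:
    """Pick the instruments whose detection probability passes the threshold.

    probs is indexed as [1][7] and coords as [1][2][7], mirroring the
    output shapes listed above. The threshold value is an assumption.
    """
    xs, ys = coords[0][0], coords[0][1]
    return [
        (i, p, xs[i], ys[i])
        for i, p in enumerate(probs[0])
        if p >= threshold
    ]


def track(
    frames: Iterable,
    infer: Callable,
    threshold: float = 0.5,
) -> Iterator[List[Detection]]:
    """Run a hypothetical infer(frame, state_0, state_1) over a video.

    infer must return (new_state_0, new_state_1, probs, coords). The two
    state outputs of frame N are passed back in as inputs for frame N + 1.
    """
    # Assumption: both states start as all zeros for the first frame.
    state_0 = [0.0] * STATE_SIZE
    state_1 = [0.0] * STATE_SIZE
    for frame in frames:
        state_0, state_1, probs, coords = infer(frame, state_0, state_1)
        yield decode(probs, coords, threshold)
```

In use, track(frames, infer) yields one list of detections per frame, so a caller can overlay tooltip positions as soon as each frame has been processed.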

Video Data

The sample data, kindly provided by Research Group Camma, IHU Strasbourg and University of Strasbourg, is a surgical video that includes all 7 instrument classes supported by the tool tracking model. The video is in raw H.264 format.

Note: The .h264 file must be converted into a GXF tensor file with the convert_video_to_gxf_entities.py script (available on GitHub) before it can be used with the VideoStreamReplayer Holoscan operator.
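
One possible way to carry out the conversion is to decode the video to raw RGB frames with ffmpeg and pipe them into the script. The sketch below only assembles the two command lines; it does not run them. It assumes that the script reads packed raw RGB frames from standard input and accepts --width, --height, --channels, and --framerate flags. Both points should be confirmed against the script's --help output. The input file name and the frame rate are placeholders, since neither is stated on this page.

```python
import shlex
from typing import List, Tuple


def build_conversion_commands(
    video_path: str,
    framerate: int,
    width: int = 854,
    height: int = 480,
    channels: int = 3,
) -> Tuple[List[str], List[str]]:
    """Return (decode_cmd, convert_cmd) as argument lists.

    The converter flag names are assumptions; check them with --help.
    """
    # Decode the H.264 stream to packed 8-bit RGB frames on stdout.
    decode_cmd = [
        "ffmpeg", "-i", video_path,
        "-pix_fmt", "rgb24", "-f", "rawvideo", "pipe:1",
    ]
    # Feed the raw frames to the conversion script on stdin.
    convert_cmd = [
        "python3", "convert_video_to_gxf_entities.py",
        "--width", str(width),
        "--height", str(height),
        "--channels", str(channels),
        "--framerate", str(framerate),
    ]
    return decode_cmd, convert_cmd


def as_shell_pipeline(decode_cmd: List[str], convert_cmd: List[str]) -> str:
    """Render both commands as a single shell pipeline string."""
    return f"{shlex.join(decode_cmd)} | {shlex.join(convert_cmd)}"


def raw_frame_bytes(width: int = 854, height: int = 480, channels: int = 3) -> int:
    """Size in bytes of one decoded 8-bit frame."""
    return width * height * channels


if __name__ == "__main__":
    # "sample.h264" and 30 fps are placeholder values.
    decode_cmd, convert_cmd = build_conversion_commands("sample.h264", framerate=30)
    print(as_shell_pipeline(decode_cmd, convert_cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to subprocess.Popen later, connecting the first process's stdout to the second's stdin, without going through a shell.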

License

Refer to the license agreement for use of the sample data.

Publisher: NVIDIA
Latest Version: 20230222
Updated: April 19, 2023 UTC
Compressed Size: 47.27 MB
