The BodyPose3DNet models described in this card perform 3D human pose estimation: given an input image, they predict a skeleton, consisting of keypoints and the connections between them, for every person in the image. 3D body pose tracking has many practical applications, including action understanding, surveillance, human-robot interaction, motion capture and CGI, augmented and virtual reality, assisted living, advanced driver assistance systems (ADAS), sports analysis and AI-powered sports coaching, workplace activity monitoring, crowd counting and tracking, and character animation that does not rely on markers or specialized suits.
Given an RGB image, the network tracks the 2D and 3D poses of the body using 34 joints: pelvis, left_hip, right_hip, torso, left_knee, right_knee, neck, left_ankle, right_ankle, left_big_toe, right_big_toe, left_small_toe, right_small_toe, left_heel, right_heel, nose, left_eye, right_eye, left_ear, right_ear, left_shoulder, right_shoulder, left_elbow, right_elbow, left_wrist, right_wrist, left_pinky_knuckle, right_pinky_knuckle, left_middle_tip, right_middle_tip, left_index_knuckle, right_index_knuckle, left_thumb_tip, right_thumb_tip. The picture shows the predicted skeleton, constructed from these 34 keypoints, overlaid on the original video frame.
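For downstream code it is often convenient to have these joint names available in index order. The sketch below simply transcribes the 34 names above into a Python list; the assumption that the list index matches the keypoint index in the network's 34-keypoint output tensors follows the order given in this card.

```python
# Keypoint names for BodyPose3DNet, in the order listed in this card.
# Assumption: the index of each name matches the keypoint index in the
# network's 34-keypoint output tensors.
BODYPOSE3DNET_KEYPOINTS = [
    "pelvis", "left_hip", "right_hip", "torso", "left_knee", "right_knee",
    "neck", "left_ankle", "right_ankle", "left_big_toe", "right_big_toe",
    "left_small_toe", "right_small_toe", "left_heel", "right_heel", "nose",
    "left_eye", "right_eye", "left_ear", "right_ear", "left_shoulder",
    "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist",
    "left_pinky_knuckle", "right_pinky_knuckle", "left_middle_tip",
    "right_middle_tip", "left_index_knuckle", "right_index_knuckle",
    "left_thumb_tip", "right_thumb_tip",
]
assert len(BODYPOSE3DNET_KEYPOINTS) == 34
```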
Architecture Type: Deep Convolutional Neural Network.
Network Architecture: HRNet
The models on this page can only be used with the Train Adapt Optimize (TAO) Toolkit. TAO provides a simple command line interface to train a deep learning model for 3D body pose estimation.
The primary use case for this model is to detect human poses in a given RGB image. BodyPose3DNet is commonly used for activity and gesture recognition, fall detection, posture analysis, and similar applications.
Install the NGC CLI from ngc.nvidia.com.
Configure the NGC CLI using the following command:
ngc config set
To list all available versions of the model, run:
ngc registry model list nvidia/tao/bodypose3dnet:*
To download the desired version, run:
ngc registry model download-version nvidia/tao/bodypose3dnet:<version> --dest <path>
The network takes five input tensors:
Input tensor 0:
name: input0
elem_type: float32
shape: -1 x 3 x 256 x 192
Input tensor 1:
name: k_inv
elem_type: float32
shape: -1 x 3 x 3
Input tensor 2:
name: t_form_inv
elem_type: float32
shape: -1 x 3 x 3
Input tensor 3:
name: scale_normalized_mean_limb_lengths
elem_type: float32
shape: -1 x 36
Input tensor 4:
name: mean_limb_lengths
elem_type: float32
shape: -1 x 36
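The sketch below shows one way to assemble these five input tensors for a single person crop. The tensor names, dtypes, and shapes come from this card; the exact preprocessing (crop transform, camera intrinsics, limb-length statistics) is an assumption, and the values used here are placeholders. Consult the TAO Toolkit documentation for the authoritative preprocessing steps.

```python
import numpy as np

batch = 1

# input0: RGB crop of a single person, resized to 256 x 192 (H x W), CHW layout.
# Placeholder data used here; real pipelines crop and normalize the frame.
crop = np.random.rand(batch, 3, 256, 192).astype(np.float32)

# k_inv: assumed to be the inverse of the 3 x 3 camera intrinsic matrix.
K = np.array([[1200.0, 0.0, 960.0],
              [0.0, 1200.0, 540.0],
              [0.0, 0.0, 1.0]], dtype=np.float32)  # hypothetical intrinsics
k_inv = np.linalg.inv(K)[None].repeat(batch, axis=0)

# t_form_inv: assumed to be the inverse of the affine transform that maps the
# original frame to the 256 x 192 crop (identity used here as a placeholder).
t_form_inv = np.eye(3, dtype=np.float32)[None].repeat(batch, axis=0)

# Limb-length priors: 36 values each; zeros used here purely as placeholders.
scale_normalized_mean_limb_lengths = np.zeros((batch, 36), dtype=np.float32)
mean_limb_lengths = np.zeros((batch, 36), dtype=np.float32)

inputs = {
    "input0": crop,
    "k_inv": k_inv,
    "t_form_inv": t_form_inv,
    "scale_normalized_mean_limb_lengths": scale_normalized_mean_limb_lengths,
    "mean_limb_lengths": mean_limb_lengths,
}
for name, tensor in inputs.items():
    print(name, tensor.shape, tensor.dtype)
```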
The network outputs four tensors:
Output tensor 0:
name: pose2d
elem_type: float32
shape: -1 x 34 x 3 [x, y, c]
Output tensor 1:
name: pose2d_org_img
elem_type: float32
shape: -1 x 34 x 3 [x, y, c]
Output tensor 2:
name: pose25d
elem_type: float32
shape: -1 x 34 x 4 [x, y, depth, c]
Output tensor 3:
name: pose3d
elem_type: float32
shape: -1 x 34 x 3 [x, y, z]
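As a minimal sketch of consuming these outputs, the helper below unpacks the four tensors for one person in the batch. It assumes `outputs` is a dict mapping the output names above to NumPy arrays, for example as returned by an ONNX Runtime or TensorRT inference wrapper; the coordinate-frame comments are inferred from the tensor names and shapes listed in this card.

```python
import numpy as np

def unpack_outputs(outputs: dict, person_idx: int = 0):
    """Unpack BodyPose3DNet outputs for a single person in the batch."""
    pose2d = outputs["pose2d"][person_idx]            # (34, 3): x, y, confidence (assumed crop coordinates)
    pose2d_org = outputs["pose2d_org_img"][person_idx]  # (34, 3): x, y, confidence (assumed original-image coordinates)
    pose25d = outputs["pose25d"][person_idx]          # (34, 4): x, y, depth, confidence
    pose3d = outputs["pose3d"][person_idx]            # (34, 3): x, y, z
    return pose2d, pose2d_org, pose25d, pose3d
```

The keypoint index along the first axis of each per-person array is assumed to follow the 34-name joint list given earlier in this card.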
This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/, or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.