
BodyPose3DNet


Description

3D human pose estimation network to predict 34 keypoints in 3D of a person in an image.

Publisher

NVIDIA

Use Case

Pose Estimation

Framework

Transfer Learning Toolkit

Latest Version

deployable_performance_v1.0

Modified

December 9, 2021

Size

38.49 MB

Model Overview

The BodyPose3DNet models described in this card perform 3D human pose estimation: they predict the skeleton of the person in a given input image, where the skeleton consists of keypoints and the connections between them. 3D body pose tracking has many practical applications, including action understanding, surveillance, human-robot interaction, motion capture and CGI, augmented and virtual reality, assisted living, advanced driver assistance systems (ADAS), sports analysis and AI-powered sports coaching, workplace activity monitoring, crowd counting and tracking, and character animation that doesn’t rely on markers or specialized suits.

Given an RGB image, the model tracks the 2D and 3D pose of the body using 34 joints: pelvis, left_hip, right_hip, torso, left_knee, right_knee, neck, left_ankle, right_ankle, left_big_toe, right_big_toe, left_small_toe, right_small_toe, left_heel, right_heel, nose, left_eye, right_eye, left_ear, right_ear, left_shoulder, right_shoulder, left_elbow, right_elbow, left_wrist, right_wrist, left_pinky_knuckle, right_pinky_knuckle, left_middle_tip, right_middle_tip, left_index_knuckle, right_index_knuckle, left_thumb_tip, right_thumb_tip. Fig 1 shows the predicted skeleton, constructed from these 34 keypoints, overlaid on the original video frame.
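The joint list above can be kept as a lookup table when working with the model outputs. The following is a minimal sketch, assuming that the rows of the output tensors follow the order in which the joints are listed in this card; the card does not state this explicitly, and the names `KEYPOINT_NAMES` and `KEYPOINT_INDEX` are illustrative.

```python
# Joint names copied from the list in this card.
# ASSUMPTION: the tensor row order matches this listing order.
KEYPOINT_NAMES = [
    "pelvis", "left_hip", "right_hip", "torso",
    "left_knee", "right_knee", "neck",
    "left_ankle", "right_ankle",
    "left_big_toe", "right_big_toe",
    "left_small_toe", "right_small_toe",
    "left_heel", "right_heel",
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_pinky_knuckle", "right_pinky_knuckle",
    "left_middle_tip", "right_middle_tip",
    "left_index_knuckle", "right_index_knuckle",
    "left_thumb_tip", "right_thumb_tip",
]

# Map each joint name to its (assumed) row index.
KEYPOINT_INDEX = {name: i for i, name in enumerate(KEYPOINT_NAMES)}

if __name__ == "__main__":
    print(len(KEYPOINT_NAMES), "joints; nose is row", KEYPOINT_INDEX["nose"])
```

If the actual row order differs, only the list needs to change; code that looks joints up by name stays the same.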

Fig 1. Example illustration of BodyPose3DNet output

Model Architecture

Architecture Type: Deep Convolutional Neural Network.

Network Architecture: HRNet

How to use this model

The models on this page can only be used with the Train Adapt Optimize (TAO) Toolkit. TAO provides a simple command line interface to train a deep learning model for 3D body pose estimation.

The primary use case for this model is to detect human poses in a given RGB image. BodyPose3DNet is commonly used for activity and gesture recognition, fall detection, posture analysis, and similar tasks.

  1. Install the NGC CLI from ngc.nvidia.com.

  2. Configure the NGC CLI using the following command:

ngc config set

  3. To list the available versions of this model:

ngc registry model list nvidia/tao/bodypose3dnet:*

  4. To download the model:

ngc registry model download-version nvidia/tao/bodypose3dnet:<template> --dest <path>
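
For scripted downloads, the download command above can be assembled programmatically. This is a minimal sketch, assuming that `<template>` stands for one of the version names listed under "Model versions" in this card; the helper `download_command` is hypothetical and only builds the argument list, it does not call the NGC CLI.

```python
import shlex

MODEL_PATH = "nvidia/tao/bodypose3dnet"

# Version names taken from the "Model versions" section of this card.
KNOWN_VERSIONS = (
    "deployable_accuracy_v1.0",
    "deployable_performance_v1.0",
)


def download_command(version, dest):
    """Return the argv list for downloading one model version.

    ASSUMPTION: the <template> placeholder in the card is the version name.
    """
    if version not in KNOWN_VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    return [
        "ngc", "registry", "model", "download-version",
        f"{MODEL_PATH}:{version}",
        "--dest", dest,
    ]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(shlex.join(download_command("deployable_performance_v1.0", "./models")))
```

The returned list can be passed to `subprocess.run` on a machine where the NGC CLI is installed and configured.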

Input

Input tensor 0: 
    name:  input0
    elem_type: float32
    shape: -1 x 3 x 256 x 192

Input tensor 1: 
    name:  k_inv
    elem_type: float32
    shape: -1 x 3 x 3

Input tensor 2: 
    name:  t_form_inv
    elem_type: float32
    shape: -1 x 3 x 3

Input tensor 3: 
    name:  scale_normalized_mean_limb_lengths
    elem_type: float32
    shape: -1 x 36

Input tensor 4: 
    name:  mean_limb_lengths
    elem_type: float32
    shape: -1 x 36
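
The five input shapes above can be checked before a batch is sent to the model. The following is a minimal sketch that uses only the names and shapes listed in this card; `INPUT_SPECS`, `shape_matches`, and `check_batch` are illustrative names, and the leading `-1` is treated as a dynamic batch dimension, which is an assumption based on common convention.

```python
# Expected input shapes from this card.
# None stands for the dynamic batch dimension written as -1 (assumed).
INPUT_SPECS = {
    "input0": (None, 3, 256, 192),
    "k_inv": (None, 3, 3),
    "t_form_inv": (None, 3, 3),
    "scale_normalized_mean_limb_lengths": (None, 36),
    "mean_limb_lengths": (None, 36),
}


def shape_matches(name, shape):
    """True if `shape` agrees with the spec on every fixed dimension."""
    spec = INPUT_SPECS[name]
    if len(shape) != len(spec):
        return False
    return all(e is None or e == a for e, a in zip(spec, shape))


def check_batch(shapes):
    """Return a list of problems for a dict of input name -> shape tuple."""
    problems = []
    for name in INPUT_SPECS:
        if name not in shapes:
            problems.append(f"missing input: {name}")
        elif not shape_matches(name, shapes[name]):
            problems.append(f"bad shape for {name}: {tuple(shapes[name])}")
    # All inputs should share the same batch size.
    batch_sizes = {s[0] for n, s in shapes.items()
                   if n in INPUT_SPECS and len(s) > 0}
    if len(batch_sizes) > 1:
        problems.append("inconsistent batch sizes")
    return problems
```

For example, `check_batch({name: array.shape for name, array in inputs.items()})` works with any array type that exposes a `shape` tuple.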

Output

The network outputs four tensors:

Output tensor 0: 
    name: pose2d
    elem_type: float32
    shape: -1 x 34 x 3 [x, y, c]

Output tensor 1: 
    name: pose2d_org_img
    elem_type: float32
    shape: -1 x 34 x 3 [x, y, c]

Output tensor 2: 
    name: pose25d
    elem_type: float32
    shape: -1 x 34 x 4 [x, y, depth, c]

Output tensor 3: 
    name: pose3d
    elem_type: float32
    shape: -1 x 34 x 3 [x, y, z]
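
A common post-processing step is to keep only the keypoints the model is confident about. The sketch below works on the `pose25d` rows for a single person, assuming that the last component `c` is a confidence score where higher means more reliable; the card does not define `c`, and both the helper name `confident_keypoints` and the default threshold of 0.5 are hypothetical.

```python
def confident_keypoints(pose25d_person, threshold=0.5):
    """Filter one person's pose25d rows by confidence.

    pose25d_person: sequence of 34 rows, each [x, y, depth, c].
    ASSUMPTION: c is a confidence score; the threshold is illustrative.
    Returns a list of (row_index, x, y, depth) for rows with c >= threshold.
    """
    if len(pose25d_person) != 34:
        raise ValueError("expected 34 keypoints")
    kept = []
    for index, (x, y, depth, c) in enumerate(pose25d_person):
        if c >= threshold:
            kept.append((index, x, y, depth))
    return kept
```

The same pattern applies to `pose2d` and `pose2d_org_img` by unpacking three values per row instead of four.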

Limitations

  1. The model supports only one person in the scene.
  2. The full body of the person must be visible.

Model versions

  • deployable_accuracy_v1.0: A deployable model with better accuracy but a higher computational cost.
  • deployable_performance_v1.0: A deployable model optimized for inference performance.

Reference

Using TAO Pre-trained Models

Technical blogs

Suggested reading

License

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, please visit this link, or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Ethical Considerations

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.