Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique: you take a model trained on one task and re-train it on a different task.
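As a minimal illustration of the idea in plain PyTorch (this is not how TAO implements it; torchvision and the 2-class head here are assumptions for the sketch):
import torch.nn as nn
import torchvision.models as models

backbone = models.resnet18(pretrained=True)          # features learned on ImageNet
for p in backbone.parameters():
    p.requires_grad = False                          # freeze the pretrained features
backbone.fc = nn.Linear(backbone.fc.in_features, 2)  # new head for a hypothetical 2-class task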
Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.
In this notebook, you will learn how to leverage the simplicity and convenience of TAO through an example use case of ActionRecognitionNet: preparing the HMDB51 data, fine-tuning a pretrained model, and then evaluating, running inference with, and exporting the trained model.
When using the purpose-built pretrained models from NGC, please make sure to set the $KEY environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.
The TAO launcher uses Docker containers under the hood, and for our data and results directories to be visible to the container, they need to be mapped. The launcher can be configured using the config file ~/.tao_mounts.json. Apart from the mounts, you can also configure additional options, such as environment variables and the amount of shared memory available to the TAO launcher.
IMPORTANT NOTE:
The code below creates a sample ~/.tao_mounts.json file. Here, we map the directories in which we save the data, specs, results, and cache. You should configure these paths for your specific case so the directories are correctly visible to the Docker container.
%env HOST_DATA_DIR=/absolute/path/to/your/host/data
# note: You can set HOST_SPECS_DIR to the folder of experiment specs downloaded with the notebook
%env HOST_SPECS_DIR=/absolute/path/to/your/host/specs
%env HOST_RESULTS_DIR=/absolute/path/to/your/host/results
# Set your encryption key, and use the same key for all commands
%env KEY = nvidia_tao
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR
# Mapping the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tlt_configs = {
    "Mounts": [
        {
            "source": os.environ["HOST_DATA_DIR"],
            "destination": "/data"
        },
        {
            "source": os.environ["HOST_SPECS_DIR"],
            "destination": "/specs"
        },
        {
            "source": os.environ["HOST_RESULTS_DIR"],
            "destination": "/results"
        },
        {
            "source": os.path.expanduser("~/.cache"),
            "destination": "/root/.cache"
        }
    ],
    "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
        }
    }
}
# Writing the mounts file.
with open(mounts_file, "w") as mfile:
json.dump(tlt_configs, mfile, indent=4)
!cat ~/.tao_mounts.json
The TAO launcher is a Python package distributed as a Python wheel listed on the nvidia-pyindex Python index. You can install the launcher by executing the following cell.
Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with Python 3.6.9. You may follow the instructions on this page to set up a Python virtual env using the virtualenv and virtualenvwrapper packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual env via the VIRTUALENVWRAPPER_PYTHON variable. You may do so by running
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
where 6 <= x <= 8.
We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure you have met the software prerequisites listed in the TAO Toolkit documentation.
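For example, a minimal virtual env setup might look like the following (the paths and the env name launcher are assumptions; adjust them to your system):
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3.8
source /usr/local/bin/virtualenvwrapper.sh
mkvirtualenv -p /usr/bin/python3.8 launcher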
Once you have installed the prerequisites, log in to the Docker registry nvcr.io with the command below:
docker login nvcr.io
You will be prompted to enter a username and password. The username is $oauthtoken and the password is the API key generated from ngc.nvidia.com. Please follow the instructions in the NGC setup guide to generate your own API key.
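If you prefer to log in non-interactively (for example in a setup script), a sketch assuming your API key is stored in an NGC_API_KEY environment variable:
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin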
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tao
# View the versions of the TAO launcher
!tao info
We will be using the HMDB51 dataset for this tutorial. First, download the HMDB51 dataset and extract the archives (we choose fall_floor and ride_bike for this tutorial):
# install unrar
# NOTE: The following commands require `sudo`. You can run the command outside the notebook.
!apt update
!apt-get install unrar
# download the dataset and unrar the files
!wget -P $HOST_DATA_DIR http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/hmdb51_org.rar
!mkdir -p $HOST_DATA_DIR/videos && unrar x $HOST_DATA_DIR/hmdb51_org.rar $HOST_DATA_DIR/videos
!mkdir -p $HOST_DATA_DIR/raw_data
!unrar x $HOST_DATA_DIR/videos/fall_floor.rar $HOST_DATA_DIR/raw_data
!unrar x $HOST_DATA_DIR/videos/ride_bike.rar $HOST_DATA_DIR/raw_data
Clone the dataset processing scripts:
!git clone https://github.com/NVIDIA-AI-IOT/tao_toolkit_recipes
Install the dependencies for the data generator:
!pip3 install xmltodict opencv-python
Run the preprocessing script:
!cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && bash ./preprocess_HMDB_RGB.sh $HOST_DATA_DIR/raw_data $HOST_DATA_DIR/processed_data
# download the split files and unrar
!wget -P $HOST_DATA_DIR http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/test_train_splits.rar
!mkdir -p $HOST_DATA_DIR/splits && unrar x $HOST_DATA_DIR/test_train_splits.rar $HOST_DATA_DIR/splits
# run split_HMDB to generate training split
!cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && python3 ./split_dataset.py $HOST_DATA_DIR/processed_data $HOST_DATA_DIR/splits/testTrainMulti_7030_splits $HOST_DATA_DIR/train $HOST_DATA_DIR/test
# verify
!ls -l $HOST_DATA_DIR/train
!ls -l $HOST_DATA_DIR/train/ride_bike
!ls -l $HOST_DATA_DIR/test
!ls -l $HOST_DATA_DIR/test/ride_bike
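As an extra sanity check, a short Python sketch to count extracted frames per clip (this assumes the preprocessing script writes frames under each clip's rgb/ subfolder; adjust if your layout differs):
import os
clip_root = os.path.join(os.environ["HOST_DATA_DIR"], "train", "ride_bike")
for clip in sorted(os.listdir(clip_root))[:5]:
    rgb_dir = os.path.join(clip_root, clip, "rgb")
    if os.path.isdir(rgb_dir):
        print(clip, "->", len(os.listdir(rgb_dir)), "frames")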
We also provide scripts to preprocess the SHAD dataset. The following cells for processing the SHAD dataset are OPTIONAL.
OPTIONAL: Download the app based on the NVOF SDK to generate optical flow. It is packaged with this notebook.
#!echo <passwd> | sudo -S apt install -y libfreeimage-dev
# # create train and test dir for raw videos and labels
# !mkdir -p $HOST_DATA_DIR/train_raw && mkdir -p $HOST_DATA_DIR/test_raw
# # download the dataset and unrar the files
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Bend-train.rar
# !unrar x $HOST_DATA_DIR/Bend-train.rar $HOST_DATA_DIR/train_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Bend-test.rar
# !unrar x $HOST_DATA_DIR/Bend-test.rar $HOST_DATA_DIR/test_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Fall-train.rar
# !unrar x $HOST_DATA_DIR/Fall-train.rar $HOST_DATA_DIR/train_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Fall-test.rar
# !unrar x $HOST_DATA_DIR/Fall-test.rar $HOST_DATA_DIR/test_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Squa-train.rar
# !unrar x $HOST_DATA_DIR/Squa-train.rar $HOST_DATA_DIR/train_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Squa-test.rar
# !unrar x $HOST_DATA_DIR/Squa-test.rar $HOST_DATA_DIR/test_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Sits-train.rar
# !unrar x $HOST_DATA_DIR/Sits-train.rar $HOST_DATA_DIR/train_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Sits-test.rar
# !unrar x $HOST_DATA_DIR/Sits-test.rar $HOST_DATA_DIR/test_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Walk-train.rar
# !unrar x $HOST_DATA_DIR/Walk-train.rar $HOST_DATA_DIR/train_raw
# !wget -P $HOST_DATA_DIR https://best.sjtu.edu.cn/Assets/userfiles/sys_eb538c1c-65ff-4e82-8e6a-a1ef01127fed/files/ZIP/Walk-test.rar
# !unrar x $HOST_DATA_DIR/Walk-test.rar $HOST_DATA_DIR/test_raw
OPTIONAL: Run the processing script for SHAD.
IMPORTANT NOTE: to run preprocess_SHAD.sh, which generates optical flow, a Turing or Ampere (or later) GPU is needed. You can run preprocess_SHAD_RGB.sh to work with RGB frames only.
# # ! cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && ./preprocess_SHAD.sh $HOST_DATA_DIR/train_raw $HOST_DATA_DIR/train
# ! cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && ./preprocess_SHAD_RGB.sh $HOST_DATA_DIR/train_raw $HOST_DATA_DIR/train
# # ! cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && ./preprocess_SHAD.sh $HOST_DATA_DIR/test_raw $HOST_DATA_DIR/test
# ! cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && ./preprocess_SHAD_RGB.sh $HOST_DATA_DIR/test_raw $HOST_DATA_DIR/test
# # verify
# !ls -l $HOST_DATA_DIR/train
# !ls -l $HOST_DATA_DIR/train/bend
# !ls -l $HOST_DATA_DIR/test
# !ls -l $HOST_DATA_DIR/test/bend
We will use the NGC CLI to get the pre-trained models. For more details, go to https://ngc.nvidia.com and click SETUP in the navigation bar.
# Installing NGC CLI on the local machine.
## Download and install
import os
%env CLI=ngccli_cat_linux.zip
!mkdir -p $HOST_RESULTS_DIR/ngccli
# Remove any previously existing CLI installations
!rm -rf $HOST_RESULTS_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $HOST_RESULTS_DIR/ngccli
!unzip -u "$HOST_RESULTS_DIR/ngccli/$CLI" -d $HOST_RESULTS_DIR/ngccli/
!rm $HOST_RESULTS_DIR/ngccli/*.zip
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("HOST_RESULTS_DIR", ""), os.getenv("PATH", ""))
!ngc registry model list nvidia/tao/actionrecognitionnet:*
!mkdir -p $HOST_RESULTS_DIR/pretrained
# Pull pretrained model from NGC
!ngc registry model download-version "nvidia/tao/actionrecognitionnet:trainable_v1.0" --dest $HOST_RESULTS_DIR/pretrained
print("Check that model is downloaded into dir.")
!ls -l $HOST_RESULTS_DIR/pretrained/actionrecognitionnet_vtrainable_v1.0
We provide specification files to configure the training parameters. Please refer to the TAO documentation for ActionRecognitionNet to see all the configurable parameters.
!cat $HOST_SPECS_DIR/train_rgb_3d_finetune.yaml
# NOTE: The following paths are set from the perspective of the TAO Docker.
# The data is saved here
%env DATA_DIR = /data
%env SPECS_DIR = /specs
%env RESULTS_DIR = /results
We provide a pretrained RGB-only model trained on the HMDB5 dataset (a 5-class subset of HMDB51). With the pretrained model, we can reach better accuracy with fewer epochs.
KNOWN ISSUE:
In the notebook, the training log will be interleaved with a PyTorch warning:
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
To see the full log on stdout, run the command in a terminal.
print("Train RGB only model with PTM")
!tao action_recognition train \
-e $SPECS_DIR/train_rgb_3d_finetune.yaml \
-r $RESULTS_DIR/rgb_3d_ptm \
-k $KEY \
model_config.rgb_pretrained_model_path=$RESULTS_DIR/pretrained/actionrecognitionnet_vtrainable_v1.0/resnet18_3d_rgb_hmdb5_32.tlt \
model_config.rgb_pretrained_num_classes=5
print('Encrypted checkpoints:')
print('---------------------')
!ls -ltrh $HOST_RESULTS_DIR/rgb_3d_ptm
print('Rename a model: note that training is not deterministic, so the checkpoint file name may differ; change the name below accordingly.')
print('---------------------')
# NOTE: The following command may require `sudo`. You can run the command outside the notebook.
!mv $HOST_RESULTS_DIR/rgb_3d_ptm/ar_model_epoch=19-val_loss=0.05.tlt $HOST_RESULTS_DIR/rgb_3d_ptm/rgb_only_model.tlt
!ls -ltrh $HOST_RESULTS_DIR/rgb_3d_ptm/rgb_only_model.tlt
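Alternatively, since the epoch and val_loss in the checkpoint name vary between runs, here is a small Python sketch that renames the most recent checkpoint instead of hard-coding the name (an assumption-based alternative to the mv above; it may need the same elevated permissions):
import glob, os
ckpt_dir = os.path.join(os.environ["HOST_RESULTS_DIR"], "rgb_3d_ptm")
# pick the newest .tlt checkpoint produced by training
ckpts = sorted(glob.glob(os.path.join(ckpt_dir, "ar_model_*.tlt")), key=os.path.getmtime)
if ckpts:
    os.rename(ckpts[-1], os.path.join(ckpt_dir, "rgb_only_model.tlt"))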
OPTIONAL
4.2 Train 2D model
IMPORTANT NOTE: The following cells use the SHAD dataset.
First, we will train a 2D RGB-only model from scratch.
# print("Train RGB only model from scratch")
# !tao action_recognition train \
# -e $SPECS_DIR/train_rgb_2d.yaml \
# -r $RESULTS_DIR/rgb_2d \
# -k $KEY \
# dataset_config.train_dataset_dir=$DATA_DIR/train \
# dataset_config.val_dataset_dir=$DATA_DIR/test
# print("To resume training from a checkpoint, set the resume_training_checkpoint_path option to be the .tlt you want to resume from")
# print("remember to remove the `=` in the checkpoint's file name")
# !tao action_recognition train \
# -e $SPECS_DIR/train_rgb_2d.yaml \
# -r $RESULTS_DIR/rgb_2d \
# -k $KEY \
# resume_training_checkpoint_path=
# print('Encrypted checkpoints:')
# print('---------------------')
# !ls -ltrh $HOST_RESULTS_DIR/rgb_2d
# print('Rename a model: ')
# print('---------------------')
# !echo <passwd> | sudo -S mv $HOST_RESULTS_DIR/rgb_2d/ar_model_epoch=21-val_loss=0.88.tlt $HOST_RESULTS_DIR/rgb_2d/rgb_only_model.tlt
# !ls -ltrh $HOST_RESULTS_DIR/rgb_2d/rgb_only_model.tlt
Next, we will train a 2D optical-flow-only model from scratch.
# print("Train optical flow only model from scratch")
# !tao action_recognition train \
# -e $SPECS_DIR/train_of_2d.yaml \
# -r $RESULTS_DIR/of_2d \
# -k $KEY \
# dataset_config.train_dataset_dir=$DATA_DIR/train \
# dataset_config.val_dataset_dir=$DATA_DIR/test
# print('Encrypted checkpoints:')
# print('---------------------')
# !ls -ltrh $HOST_RESULTS_DIR/of_2d
# print('Rename a model: ')
# print('---------------------')
# !echo <passwd> | sudo -S mv $HOST_RESULTS_DIR/of_2d/ar_model_epoch=29-val_loss=0.94.tlt $HOST_RESULTS_DIR/of_2d/of_only_model.tlt
# !ls -ltrh $HOST_RESULTS_DIR/of_2d/of_only_model.tlt
Finally, we will train a 2D joint model that consumes both RGB frames and optical flow frames, based on the two pretrained single-stream models.
# print("Train joint model based on RGB and OF model")
# !tao action_recognition train \
# -e $SPECS_DIR/train_joint_2d.yaml \
# -r $RESULTS_DIR/joint_2d \
# -k $KEY \
# model_config.rgb_pretrained_model_path=$RESULTS_DIR/rgb_2d/rgb_only_model.tlt \
# model_config.of_pretrained_model_path=$RESULTS_DIR/of_2d/of_only_model.tlt \
# dataset_config.train_dataset_dir=$DATA_DIR/train \
# dataset_config.val_dataset_dir=$DATA_DIR/test
# print('Encrypted checkpoints:')
# print('---------------------')
# !ls -ltrh $HOST_RESULTS_DIR/joint_2d
# print('Rename a model: ')
# print('---------------------')
# !echo <passwd> | sudo -S mv $HOST_RESULTS_DIR/joint_2d/ar_model_epoch=16-val_loss=0.60.tlt $HOST_RESULTS_DIR/joint_2d/joint_model.tlt
# !ls -ltrh $HOST_RESULTS_DIR/joint_2d/joint_model.tlt
We provide two different sampling strategies to evaluate the pretrained model on video clips:
center mode: picks the middle frames of a sequence for inference. For example, if the model requires 32 frames as input and a video clip has 128 frames, the frames from index 48 to index 79 are used.
conv mode: uniformly samples 10 sequences out of a single video, runs inference on each, and averages the final results.
Evaluate the RGB model trained with the PTM:
!tao action_recognition evaluate \
-e $SPECS_DIR/evaluate_rgb.yaml \
-k $KEY \
model=$RESULTS_DIR/rgb_3d_ptm/rgb_only_model.tlt \
batch_size=1 \
test_dataset_dir=$DATA_DIR/test \
video_eval_mode=center
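As a side note, the frame window that center mode uses in the example above can be reproduced with a small sketch (the function name and the 32-frame default are illustrative):
def center_window(clip_len, model_input_len=32):
    # middle model_input_len frames of a clip_len-frame clip
    start = (clip_len - model_input_len) // 2
    return start, start + model_input_len - 1

print(center_window(128))  # -> (48, 79), matching the example above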
In this section, we run the action recognition inference tool to generate inferences with the trained RGB model and print the results.
As with evaluation, there are two modes for inference: center mode and conv mode. The final output shows the label of each input sequence in the video, in the form:
[video_sample_path] [labels list for sequences in the video sample]
!tao action_recognition inference \
-e $SPECS_DIR/infer_rgb.yaml \
-k $KEY \
model=$RESULTS_DIR/rgb_3d_ptm/rgb_only_model.tlt \
inference_dataset_dir=$DATA_DIR/test/ride_bike \
video_inf_mode=center
!mkdir -p $HOST_RESULTS_DIR/export
# Export the RGB model to encrypted ONNX model
!tao action_recognition export \
-e $SPECS_DIR/export_rgb.yaml \
-k $KEY \
model=$RESULTS_DIR/rgb_3d_ptm/rgb_only_model.tlt \
output_file=$RESULTS_DIR/export/rgb_resnet18_3.etlt
print('Exported model:')
print('------------')
!ls -lth $HOST_RESULTS_DIR/export
This notebook has come to an end. You may continue by deploying this model with DeepStream.