Object detection is a popular computer vision technique that detects one or more objects in a frame and places bounding boxes around them. The sample Jupyter notebook provided here contains a ResNet18 model that you can retrain on an AzureML Compute Target.
This is a pretrained ResNet18 model used as the backbone of a DetectNet_v2 object detection network. It has been trained on an object detection and object orientation estimation benchmark consisting of 7,481 training images and 7,518 test images, comprising a total of 80,256 labeled objects.
In this Jupyter notebook, we will retrain this model. The model contains pretrained weights that can be used as a starting point for DetectNet_v2 object detection networks. The ResNet18 model used in this notebook is an unpruned model with only the feature extractor weights, and it has to be retrained from scratch for an object detection use case.
A mean average precision (mAP) of 74.57% was achieved after training the model for 120 epochs. The per-class average precision values are:

Car: 78.62%
Cyclist: 80.78%
Pedestrian: 64.27%

These numbers are shown as a reference. Depending on your training data and any additional classes you add to the model, you may have to train for more than 120 epochs to achieve the desired accuracy.
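As a quick sanity check, the overall mAP is just the arithmetic mean of the per-class average precision values. A minimal sketch (the small discrepancy with the reported 74.57% comes from the per-class figures themselves being rounded to two decimals):

```python
# mAP is the arithmetic mean of the per-class average precision (AP) values.
per_class_ap = {
    "Car": 78.62,
    "Cyclist": 80.78,
    "Pedestrian": 64.27,
}

map_score = sum(per_class_ap.values()) / len(per_class_ap)
print(f"mAP = {map_score:.2f}%")  # close to the reported 74.57%
```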
For more information about other pretrained models that can be used as a starting point for DetectNet_v2, refer to this NGC page: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_detectnet_v2
To fine-tune and customize the model, you'll use the TAO (Train, Adapt and Optimize) Toolkit. The TAO Toolkit is a low-code AI model development solution that leverages the power of transfer learning to help you fine-tune pretrained models with your own data. Transfer learning is a commonly used training technique in which features learned by a model on one task are reused by retraining the model on a different task. With the TAO Toolkit, you can customize models for tasks in computer vision, natural language processing, and speech.
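Conceptually, transfer learning keeps the pretrained feature extractor frozen and trains only a new task-specific head on top of it. The toy sketch below is plain Python, not the TAO Toolkit API, and every name in it is made up purely for illustration: a frozen "backbone" produces features, and only the linear "head" is updated.

```python
import random

random.seed(0)

def backbone(x):
    """Stand-in for a pretrained feature extractor; its 'weights' are
    frozen and never updated during retraining."""
    return [x, x * x]  # pretend these are learned features

# Trainable head: a linear layer on top of the frozen features.
head_w = [0.0, 0.0]
head_b = 0.0

def predict(x):
    f = backbone(x)
    return head_w[0] * f[0] + head_w[1] * f[1] + head_b

# New task: y = 3*x^2 + 1, learnable from the backbone's features alone.
data = [(x / 10.0, 3 * (x / 10.0) ** 2 + 1) for x in range(-10, 11)]

lr = 0.05
for _ in range(5000):
    x, y = random.choice(data)
    err = predict(x) - y
    f = backbone(x)
    # Gradient step on the head only; the backbone stays frozen.
    head_w[0] -= lr * err * f[0]
    head_w[1] -= lr * err * f[1]
    head_b -= lr * err
```

The same idea scales up: in practice the backbone is a deep network such as ResNet18, and the head is the detection layers that are trained on your data.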
Once you have customized the model, you can use the built-in optimization techniques, such as model pruning and quantization, to optimize it for inference on the target GPU without sacrificing accuracy.
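In the TAO Toolkit these optimizations are built-in commands, so the sketch below is only a conceptual illustration of the two ideas, not the TAO API: magnitude pruning zeroes out small weights, and symmetric INT8 quantization maps floats onto a small integer range for faster inference.

```python
weights = [0.8, -0.03, 0.5, 0.01, -0.9, 0.002, 0.4]

# Magnitude pruning: zero out weights whose magnitude is below a threshold.
threshold = 0.05
pruned = [w if abs(w) >= threshold else 0.0 for w in weights]

# Symmetric INT8 quantization: map floats onto integers in [-127, 127].
scale = max(abs(w) for w in pruned) / 127.0
quantized = [round(w / scale) for w in pruned]

# Dequantize to see how little precision was lost.
dequantized = [q * scale for q in quantized]
```

Real toolkits prune whole channels rather than individual weights and calibrate quantization scales on sample data, but the trade-off is the same: a smaller, faster model with minimal accuracy loss.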
All the training steps are covered in the Jupyter notebook.
Note: A customized Jupyter Notebook kernel, built on the TAO Toolkit container, is used as the primary mechanism for deployment.