With a ResNet-50 backbone and a number of architectural modifications, this version provides better accuracy and performance.
Despite the changes described in the previous section, the overall architecture, shown in the following diagram, has not changed.
Figure 1. The architecture of a Single Shot MultiBox Detector model. Image taken from the Single Shot MultiBox Detector paper.
The backbone is followed by 5 additional convolutional layers. In addition to these convolutional layers, we attached 6 detection heads.
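To make the layout of the 6 detection heads more concrete, the following minimal sketch counts the default boxes such a model would predict. The feature-map sizes and boxes-per-location values are assumptions taken from the SSD300 configuration in the Single Shot MultiBox Detector paper, not from this model card. The `DetectionHead` class, the `HEADS` list, the `total_default_boxes` helper, and the mapping of one head to the backbone output and five heads to the additional layers are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionHead:
    """Hypothetical description of one detection head (illustration only)."""
    source: str               # where the head is attached (assumed)
    feature_map_size: int     # side length of the square feature map (assumed)
    boxes_per_location: int   # default boxes per feature-map cell (assumed)

    def num_default_boxes(self) -> int:
        # One set of default boxes for every cell of the feature map.
        return self.feature_map_size ** 2 * self.boxes_per_location


# Assumed SSD300-style configuration: one head on the backbone output and
# one head on each of the 5 additional convolutional layers.
HEADS = [
    DetectionHead("backbone output", 38, 4),
    DetectionHead("additional layer 1", 19, 6),
    DetectionHead("additional layer 2", 10, 6),
    DetectionHead("additional layer 3", 5, 6),
    DetectionHead("additional layer 4", 3, 4),
    DetectionHead("additional layer 5", 1, 4),
]


def total_default_boxes(heads) -> int:
    """Sum the default boxes contributed by all detection heads."""
    return sum(head.num_default_boxes() for head in heads)


if __name__ == "__main__":
    for head in HEADS:
        print(f"{head.source}: {head.num_default_boxes()} boxes")
    print("total:", total_default_boxes(HEADS))
```

Under these assumed values the heads together cover 8732 default boxes, which matches the figure reported for SSD300 in the paper. The exact configuration of this model may differ.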
This model was trained using the script available on NGC and in the GitHub repository.
The following datasets were used to train this model:
Performance numbers for this model are available on NGC.
This model was trained using open-source software available in the Deep Learning Examples repository. For terms of use, refer to the licenses of the script and of the datasets from which the model was derived.