GPUNet-D1 pretrained weights (PyTorch, AMP, ImageNet)

Description

GPUNet-D1 weights pretrained on ImageNet

Publisher

NVIDIA Deep Learning Examples

Use Case

Classification

Framework

PyTorch

Latest Version

21.12.0_amp

Modified

May 12, 2022

Size

162.5 MB

Model Overview

GPUNet is a new family of Convolutional Neural Networks crafted by NVIDIA AI.

Model Architecture

The above table describes the general structure of GPUNet, which consists of 8 stages; we search for the configuration of each stage, and the layers within a stage share the same configuration. The first two stages search for the head configurations using convolutions. Inspired by EfficientNet-V2, stages 2 and 3 use Fused Inverted Residual Blocks (Fused-IRB); however, we observed increased latency after replacing the remaining IRBs with Fused-IRBs. Therefore, stages 4 to 7 use the IRB as the primary layer. The column #Layers shows the range of the number of layers in a stage; for example, [3, 10] at stage 4 means that the stage can have 3 to 10 IRBs. The column Filters shows the range of filters for the layers in the stage. We also tuned the expansion ratio, activation types, kernel sizes, and the Squeeze-and-Excitation (SE) layer inside the IRB/Fused-IRB. Finally, the input image dimension ranges from 224 to 512 in steps of 32.
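The stage layout and ranges described above can be written down as a small Python sketch. It is an illustration only, not the authors' search code: the 0-based stage indices, the function and constant names, and the reading of "step 32" as a 32-pixel step size are all assumptions.

```python
# Sketch of the search space described above. Stage indices 0-7 and all names
# here are assumptions made for illustration, not the authors' code.

NUM_STAGES = 8


def block_type(stage: int) -> str:
    """Return the primary layer type searched at a given stage."""
    if stage in (0, 1):
        return "conv"        # head stages built from plain convolutions
    if stage in (2, 3):
        return "fused_irb"   # Fused Inverted Residual Blocks
    if 4 <= stage <= 7:
        return "irb"         # Inverted Residual Blocks
    raise ValueError(f"stage must be in 0..{NUM_STAGES - 1}, got {stage}")


# Candidate input resolutions: 224 to 512 inclusive, in steps of 32.
RESOLUTIONS = list(range(224, 512 + 1, 32))

# Example #Layers range from the text: [3, 10] at stage 4, both ends inclusive.
STAGE4_LAYER_CHOICES = list(range(3, 10 + 1))

print([block_type(s) for s in range(NUM_STAGES)])
print(RESOLUTIONS)
print(len(STAGE4_LAYER_CHOICES))
```

Under these assumptions there are 10 candidate resolutions and 8 possible layer counts for stage 4, which gives a rough sense of how quickly the per-stage choices multiply.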

GPUNet provides seven specific model architectures at different latencies. You can easily query the architecture details from the JSON-formatted model definitions (for example, those in eval.py). The following figure describes GPUNet-0, GPUNet-1, and GPUNet-2 from the paper. Note that in stages 2, 3, 4, and 6, only the first IRB has a stride of 2; the remaining IRBs have a stride of 1.
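As a rough illustration of querying a JSON-formatted architecture, the sketch below parses a small layer list and checks the stride rule stated above. The JSON content and its key names (`stage`, `layer_type`, `stride`) are hypothetical and chosen for this example; the definitions in eval.py may be laid out differently.

```python
import json
from collections import defaultdict

# Hypothetical layer list; the key names and values are assumptions for
# illustration and are not copied from the repository.
MODEL_JSON = """
[
  {"stage": 1, "layer_type": "conv", "stride": 1},
  {"stage": 2, "layer_type": "fused_irb", "stride": 2},
  {"stage": 2, "layer_type": "fused_irb", "stride": 1},
  {"stage": 3, "layer_type": "fused_irb", "stride": 2},
  {"stage": 4, "layer_type": "irb", "stride": 2},
  {"stage": 4, "layer_type": "irb", "stride": 1},
  {"stage": 5, "layer_type": "irb", "stride": 1},
  {"stage": 6, "layer_type": "irb", "stride": 2},
  {"stage": 6, "layer_type": "irb", "stride": 1}
]
"""

BLOCK_TYPES = ("irb", "fused_irb")


def strides_by_stage(layers):
    """Group the strides of IRB/Fused-IRB layers by stage, in layer order."""
    grouped = defaultdict(list)
    for layer in layers:
        if layer["layer_type"] in BLOCK_TYPES:
            grouped[layer["stage"]].append(layer["stride"])
    return dict(grouped)


def follows_stride_rule(layers, downsampling_stages=(2, 3, 4, 6)):
    """Check that, in the given stages, only the first block has stride 2."""
    for stage, strides in strides_by_stage(layers).items():
        if stage not in downsampling_stages:
            continue
        if strides[0] != 2 or any(s != 1 for s in strides[1:]):
            return False
    return True


layers = json.loads(MODEL_JSON)
print(strides_by_stage(layers))
print(follows_stride_rule(layers))
```

The same grouping approach extends to other per-layer fields, such as kernel size or expansion ratio, if the real definitions expose them.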

Training

This model was trained using the script available in the GitHub repo.

Dataset

The following datasets were used to train this model:

  • ImageNet - Image database organized according to the WordNet hierarchy, in which each noun is depicted by hundreds and thousands of images.

Performance

Performance numbers for this model are available in the performance section of the GitHub README.

License

This model was trained using open-source software available in the Deep Learning Examples repository. For terms of use, please refer to the license of the script and of the datasets from which the model was derived.