GPUNet-0 pretrained weights (PyTorch, AMP, ImageNet)

GPUNet-0 ImageNet pretrained weights
NVIDIA Deep Learning Examples
Latest Version
April 4, 2023
181.85 MB

Model Overview

GPUNet is a new family of Convolutional Neural Networks crafted by NVIDIA AI.

Model Architecture

The table above describes the general structure of GPUNet, which consists of 8 stages; the configuration of each stage is searched, and the layers within a stage share the same configuration. The first two stages search for the head configurations using convolutions. Inspired by EfficientNet-V2, stages 2 and 3 use Fused Inverted Residual Blocks (Fused-IRB); however, we observed increased latency after replacing the remaining IRBs with Fused-IRBs. Therefore, stages 4 to 7 use the IRB as the primary layer. The #Layers column shows the range of layer counts in a stage (for example, [3, 10] at stage 4 means that the stage can have 3 to 10 IRBs), and the filters column shows the range of filter counts for the layers in the stage. We also tuned the expansion ratio, activation types, kernel sizes, and the Squeeze-and-Excitation (SE) layer inside the IRB/Fused-IRB. Finally, the input image dimension ranges from 224 to 512 in steps of 32.
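As a rough illustration of the search space described in the paragraph above, the following Python sketch maps each stage index to its block type and enumerates the candidate input resolutions. It assumes stages are numbered 0 to 7, which is consistent with the text but not stated outright. The names `block_type`, `RESOLUTIONS`, and `STAGE4_LAYER_RANGE` are hypothetical and do not come from the GPUNet code base.

```python
# Minimal sketch of the stage layout and resolution range described in the text.
# Assumption: stages are 0-indexed (0..7). All names here are illustrative only.

def block_type(stage: int) -> str:
    """Return the primary layer type for a stage, per the description above."""
    if stage in (0, 1):
        return "conv"        # head stages built from plain convolutions
    if stage in (2, 3):
        return "fused_irb"   # Fused Inverted Residual Blocks
    if 4 <= stage <= 7:
        return "irb"         # Inverted Residual Blocks
    raise ValueError(f"stage must be in 0..7, got {stage}")


# Candidate input image sizes: 224 to 512 inclusive, in steps of 32.
RESOLUTIONS = list(range(224, 512 + 1, 32))

# The only per-stage layer range quoted in the text: stage 4 may hold 3 to 10 IRBs.
STAGE4_LAYER_RANGE = (3, 10)


def is_valid_stage4_depth(num_layers: int) -> bool:
    """Check a layer count against the stage-4 range (bounds inclusive)."""
    low, high = STAGE4_LAYER_RANGE
    return low <= num_layers <= high


if __name__ == "__main__":
    print([block_type(s) for s in range(8)])
    print(RESOLUTIONS)
```

The layer and filter ranges for the other stages live in the architecture table, so they are deliberately left out of this sketch.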

GPUNet provides seven specific model architectures at different latencies. You can easily query the architecture details from the JSON-formatted model (for example, those in […]). The following figure describes GPUNet-0, GPUNet-1, and GPUNet-2 in the paper. Note that in stages 2, 3, 4, and 6, only the first IRB has a stride of 2; the remaining IRBs in those stages have a stride of 1.
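The stride rule in the note above can be written out as a short Python sketch. The helper name `block_strides` is hypothetical and not part of the GPUNet repository. Because the text specifies strides only for stages 2, 3, 4, and 6, the sketch refuses to guess for any other stage.

```python
# Minimal sketch of the stride rule stated above; names are illustrative only.
from typing import List

# Stages where the first block downsamples, according to the text.
DOWNSAMPLING_STAGES = frozenset({2, 3, 4, 6})


def block_strides(stage: int, num_blocks: int) -> List[int]:
    """Return per-block strides for a stage covered by the rule.

    The first block uses stride 2 and every later block uses stride 1.
    Stages outside {2, 3, 4, 6} are not described in the text, so they
    raise ValueError instead of returning a guess.
    """
    if stage not in DOWNSAMPLING_STAGES:
        raise ValueError(f"stride rule is not specified for stage {stage}")
    if num_blocks < 1:
        raise ValueError("a stage needs at least one block")
    return [2] + [1] * (num_blocks - 1)


if __name__ == "__main__":
    # Example: a stage-4 configuration with three IRBs.
    print(block_strides(4, 3))
```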


This model was trained using the script available in the GitHub repo.


The following datasets were used to train this model:

  • ImageNet - Image database organized according to the WordNet hierarchy, in which each noun is depicted by hundreds and thousands of images.


Performance numbers for this model are available in the performance section of the GitHub README.



This model was trained using open-source software available in the Deep Learning Examples repository. For terms of use, please refer to the license of the script and of the datasets the model was derived from.