Pre-trained DINO ImageNet weights

Description: Pre-trained DINO weights trained on ImageNet to facilitate transfer learning using TAO Toolkit.
Publisher: -
Latest Version: gcvit_large_imagenet22k_384
Modified: October 16, 2023
Size: 824.03 MB

TAO Pretrained Non-commercial Backbone for DINO

What is Train Adapt Optimize (TAO) Toolkit?

Train Adapt Optimize (TAO) Toolkit is a Python-based AI toolkit for taking purpose-built pre-trained AI models and customizing them with your own data. TAO adapts popular network architectures and backbones to your data, allowing you to train, fine-tune, prune, and export highly optimized and accurate AI models for edge deployment.

Pre-trained models accelerate the AI training process and reduce the costs of large-scale data collection, labeling, and training models from scratch. Transfer learning with pre-trained models can be used for AI applications in smart cities, retail, healthcare, industrial inspection, and more.

Build end-to-end services and solutions for transforming pixels and sensor data into actionable insights using TAO, DeepStream SDK, and TensorRT. These models are suitable for object detection, classification, and segmentation.

DINO Based Object Detection

Object detection is a popular computer vision technique that detects one or more objects in a frame. It recognizes the individual objects in an image and places a bounding box around each one. This model card contains pretrained weights that may be used as a starting point with the DINO object detection networks in Train Adapt Optimize (TAO) Toolkit to facilitate transfer learning.

The weights are trained on ImageNet-1K, except for the versions whose names indicate ImageNet-22K pre-training. The following backbones are supported with DINO networks.

Supported Backbones:

  • resnet_50
  • gc_vit_xxtiny / gc_vit_xtiny / gc_vit_tiny / gc_vit_small / gc_vit_base / gc_vit_large / gc_vit_large_384
  • fan_tiny / fan_small / fan_base / fan_large
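
The slash-separated entries above expand to twelve backbone names. The following is a minimal Python sketch of checking a name against that list before writing it into a spec file. It assumes the names are used verbatim as the backbone value (the spec snippet on this page confirms that only for resnet_50). The helper check_backbone is hypothetical and is not part of TAO Toolkit.

```python
# Backbone names copied from the "Supported Backbones" list on this page.
SUPPORTED_BACKBONES = frozenset(
    ["resnet_50"]
    + [f"gc_vit_{size}" for size in
       ("xxtiny", "xtiny", "tiny", "small", "base", "large", "large_384")]
    + [f"fan_{size}" for size in ("tiny", "small", "base", "large")]
)


def check_backbone(name: str) -> str:
    """Return `name` unchanged if it is listed above, else raise ValueError.

    Hypothetical helper for illustration; TAO performs its own validation.
    """
    if name not in SUPPORTED_BACKBONES:
        options = ", ".join(sorted(SUPPORTED_BACKBONES))
        raise ValueError(f"unsupported backbone {name!r}; expected one of: {options}")
    return name
```

Note that the backbone names (gc_vit_tiny) are spelled differently from the version names of the weight files (gcvit_tiny_imagenet1k), so a check like this can catch a version name pasted into the backbone field by mistake.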

Model Versions

  • gcvit_xxtiny_imagenet1k - ImageNet-1K pre-trained GCViT-xxTiny model for fine-tuning.
  • gcvit_xtiny_imagenet1k - ImageNet-1K pre-trained GCViT-xTiny model for fine-tuning.
  • gcvit_tiny_imagenet1k - ImageNet-1K pre-trained GCViT-Tiny model for fine-tuning.
  • gcvit_small_imagenet1k - ImageNet-1K pre-trained GCViT-Small model for fine-tuning.
  • gcvit_base_imagenet1k - ImageNet-1K pre-trained GCViT-Base model for fine-tuning.
  • gcvit_large_imagenet1k - ImageNet-1K pre-trained GCViT-Large model for fine-tuning.
  • gcvit_large_imagenet22k_384 - ImageNet-22K pre-trained GCViT-Large model for fine-tuning.
  • fan_hybrid_tiny - ImageNet-1K pre-trained FAN-Hybrid-Tiny model for fine-tuning. (224 resolution)
  • fan_hybrid_small - ImageNet-1K pre-trained FAN-Hybrid-Small model for fine-tuning. (224 resolution)
  • fan_hybrid_base_in22k - ImageNet-22K pre-trained FAN-Hybrid-Base model for fine-tuning. (224 resolution)
  • fan_hybrid_base_in22k_1k - ImageNet-22K pre-trained FAN-Hybrid-Base model fine-tuned on ImageNet-1K. (224 resolution)
  • fan_hybrid_base_in22k_1k_384 - ImageNet-22K pre-trained FAN-Hybrid-Base model fine-tuned on ImageNet-1K. (384 resolution)
  • fan_hybrid_large_in22k - ImageNet-22K pre-trained FAN-Hybrid-Large model for fine-tuning. (224 resolution)
  • fan_hybrid_large_in22k_384 - ImageNet-22K pre-trained FAN-Hybrid-Large model for fine-tuning. (384 resolution)
  • fan_hybrid_large_in22k_1k - ImageNet-22K pre-trained FAN-Hybrid-Large model fine-tuned on ImageNet-1K. (224 resolution)
  • fan_hybrid_large_in22k_1k_384 - ImageNet-22K pre-trained FAN-Hybrid-Large model fine-tuned on ImageNet-1K. (384 resolution)
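
The version names above appear to encode the model family, the size, the pre-training dataset, an optional ImageNet-1K fine-tuning stage, and an optional input resolution. The Python sketch below decodes a name into those fields. The naming convention is inferred from this list rather than documented, and parse_version and VersionInfo are hypothetical names. A trailing _384 is assumed to mean 384 resolution. FAN-Hybrid names without a suffix default to 224, as the list states, while GCViT names without a suffix are left unset because the list gives no resolution for them.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the version names on this page (an assumption,
# not an official naming specification).
_PATTERN = re.compile(
    r"^(?P<family>gcvit|fan_hybrid)_"
    r"(?P<size>xxtiny|xtiny|tiny|small|base|large)"
    r"(?:_(?:imagenet|in)(?P<pretrain>1k|22k))?"
    r"(?P<finetuned>_1k)?"
    r"(?:_(?P<res>\d+))?$"
)


@dataclass(frozen=True)
class VersionInfo:
    family: str
    size: str
    pretrain_dataset: str       # "ImageNet-1K" or "ImageNet-22K"
    finetuned_on_1k: bool
    resolution: Optional[int]   # None when the list does not state one


def parse_version(name: str) -> VersionInfo:
    """Decode a version name such as 'fan_hybrid_base_in22k_1k_384'."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized version name: {name!r}")
    if m.group("res") is not None:
        resolution = int(m.group("res"))
    elif m.group("family") == "fan_hybrid":
        resolution = 224        # stated in the list for unsuffixed FAN names
    else:
        resolution = None       # not stated for unsuffixed GCViT names
    return VersionInfo(
        family=m.group("family"),
        size=m.group("size"),
        # Names without a dataset tag (e.g. fan_hybrid_tiny) are ImageNet-1K.
        pretrain_dataset="ImageNet-22K" if m.group("pretrain") == "22k" else "ImageNet-1K",
        finetuned_on_1k=m.group("finetuned") is not None,
        resolution=resolution,
    )
```

For example, parse_version("fan_hybrid_large_in22k_1k") yields a FAN-Hybrid-Large entry pre-trained on ImageNet-22K and fine-tuned on ImageNet-1K, which makes it easy to cross-check a name against its description.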

Instructions to Use Pretrained Backbone Models with TAO

To use these models as pretrained backbone weights for transfer learning, use the snippet below as a template for the model and train components of the experiment spec file when training a DINO model. For more information on the experiment spec file, refer to the TAO Toolkit User Guide.

model:
  pretrained_backbone_path: /path/to/the/resnet50.pth
  backbone: resnet_50
  train_backbone: True
  num_feature_levels: 4
  dec_layers: 6
  enc_layers: 6
  num_queries: 900
  dropout_ratio: 0.0
  dim_feedforward: 2048
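
When switching between backbones, the same model section can be generated rather than edited by hand. The Python sketch below reproduces the template above for a given backbone name and weights path. The function model_spec is hypothetical, the path is a placeholder, and reusing the ResNet-50 hyperparameters for other backbones is an assumption that this page does not confirm.

```python
def model_spec(backbone: str, weights_path: str, train_backbone: bool = True) -> str:
    """Render the `model` section of a DINO experiment spec as YAML text.

    Field names and default values are copied from the template on this
    page; whether the defaults suit every backbone is an assumption.
    """
    fields = {
        "pretrained_backbone_path": weights_path,
        "backbone": backbone,
        "train_backbone": train_backbone,
        "num_feature_levels": 4,
        "dec_layers": 6,
        "enc_layers": 6,
        "num_queries": 900,
        "dropout_ratio": 0.0,
        "dim_feedforward": 2048,
    }
    lines = ["model:"] + [f"  {key}: {value}" for key, value in fields.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder path; point it at the downloaded weights file.
    print(model_spec("resnet_50", "/path/to/the/resnet50.pth"), end="")
```

Passing train_backbone=False writes train_backbone: False, which by its name presumably keeps the backbone weights fixed during training; check the TAO Toolkit User Guide for the exact behavior.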

License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC-BY-NC-SA-4.0). To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Ethical AI

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.