# TAO Pretrained EfficientDet ## Description: EfficientDet recognizes the individual objects in an image. This model is ready for commercial use. ## References: ### Citations - Tan, Mingxing, Ruoming Pang, and Quoc V. Le. "Efficientdet: Scalable and efficient object detection." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020. ### Other TAO Pre-trained Models - Get [TAO Object Detection](https://ngc.nvidia.com/catalog/models/nvidia:tao:pretrained_object_detection) pre-trained models for **YOLOV4, YOLOV3, FasterRCNN, SSD, DSSD, and RetinaNet** architectures from NGC model registry - Get [TAO EfficientDet Object Detection](https://ngc.nvidia.com/catalog/models/nvidia:tao:pretrained_efficientdet) pre-trained models for **DetectNet_v2** architecture from NGC model registry - Get [TAO classification](https://ngc.nvidia.com/catalog/models/nvidia:tao:pretrained_classification) pre-trained models from NGC model registry - Get [TAO Instance segmentation](https://ngc.nvidia.com/catalog/models/nvidia:tao:pretrained_instance_segmentation) pre-trained models for **MaskRCNN** architecture from NGC - Get [TAO Semantic segmentation](https://ngc.nvidia.com/catalog/models/nvidia:tao:pretrained_semantic_segmentation) pre-trained models for **UNet** architecture from NGC - Get Purpose-built models from NGC model registry: - [PeopleNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:peoplenet) - [TrafficCamNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:trafficcamnet) - [DashCamNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:dashcamnet) - [FaceDetectIR](https://ngc.nvidia.com/catalog/models/nvidia:tao:facedetectir) - [VehicleMakeNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:vehiclemakenet) - [VehicleTypeNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:vehicletypenet) - [PeopleSegNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:peoplesegnet) - [PeopleSemSegNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:peoplesemsegnet) - [License Plate Detection](https://ngc.nvidia.com/catalog/models/nvidia:tao:lpdnet) - [License Plate Recognition](https://ngc.nvidia.com/catalog/models/nvidia:tao:lprnet) - [Facial Landmark](https://ngc.nvidia.com/catalog/models/nvidia:tao:fpenet) - [FaceDetect](https://ngc.nvidia.com/catalog/models/nvidia:facenet) - [2D Body Pose Net](https://ngc.nvidia.com/catalog/models/nvidia:tao:bodyposenet) - [ActionRecognitionNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:actionrecognitionnet) ## Model Architecture: **Architecture Type:** Convolution Neural Network (CNN)
**Network Architecture:** [EfficientNet](https://arxiv.org/pdf/1905.11946.pdf)
The models in this instance are feature extractors based on the EfficientNet architecture. ## Input: **Input Type(s):** Image
**Input Format(s):** Red, Green, Blue (RGB)
**Input Parameters:** 3D
**Other Properties Related to Input:** RGB Fixed Resolution: 224 X 224 X 3 (W x H x C); No minimum bit depth, alpha, or gamma.
## Output: **Output Type(s):** Label(s), Bounding-Box(es), Confidence Scores
**Output Format:** Label: Text String(s); Bounding Box: (x-coordinate, y-coordinate, width, height), Confidence Scores: Floating Point
**Other Properties Related to Output:** Category Label(s): (Labels of object detected), Bounding Box Coordinates, Confidence Scores
## Software Integration: **Runtime Engine(s):** * DeepStream 6.1 or later
* TAO - 5.2
**Supported Hardware Architecture(s):**
* Ampere
* Jetson
* Hopper
* Lovelace
* Pascal
* Turing
**Supported Operating System(s):**
* Linux
* Linux 4 Tegra
## Model Version(s): The following efficientnet-x backbone versions are supported in TAO Toolkit: - efficientnet-b0 - efficientnet-b1 - efficientnet-b2 - efficientnet-b3 - efficientnet-b4 - efficientnet-b5 # Training & Evaluation: ## Training Dataset: **Link:** [https://github.com/openimages/dataset/blob/main/READMEV3.md](https://github.com/openimages/dataset/blob/main/READMEV3.md)
**Data Collection Method by dataset:**
* Unknown
**Labeling Method by dataset:**
* Unknown
**Properties:**
Roughly 400,000 images and 7,000 validation images across thousands of classes as defined by [Google OpenImages Version Three (3) dataset](https://github.com/openimages/dataset/blob/main/READMEV3.md). Most of the human verifications have been done with in-house annotators at Google. A smaller part has been done with crowd-sourced verification from Image Labeler: [Crowdsource app](http://play.google.com/store/apps/details?id=com.google.android.apps.village.boond), [g.co/imagelabeler](http://g.co/imagelabeler). ## Evaluation Dataset: **Link:** [https://github.com/openimages/dataset/blob/main/READMEV3.md](https://github.com/openimages/dataset/blob/main/READMEV3.md)
**Data Collection Method by dataset:**
* Unknown
**Labeling Method by dataset:**
* Unknown
**Properties:**
- 15,000 test images from [Google OpenImages Version Three (3) dataset](https://github.com/openimages/dataset/blob/main/READMEV3.md). ## Inference: **Engine:** Tensor(RT)
**Test Hardware:**
- Jetson AGX Xavier - Xavier NX - Orin - Orin NX - NVIDIA T4 - Ampere GPU - A2 - A30 - L4 - T4 - DGX H100 - DGX A100 - DGX H100 - L40 - JAO 64GB - Orin NX16GB - Orin Nano 8GB ## How to Use this Model ### Running EfficientDet Models Using TAO The EfficientDet app in TAO expect data in COCO format. TAO provides a simple command line interface to train a deep learning model for object detection. The models in this model area are only compatible with TAO Toolkit. For more information about the TAO container, please visit the TAO container page. 1. Install the NGC CLI from ngc.nvidia.com 2. Configure the NGC CLI using the following command ```sh ngc config set ``` 3. To view all the backbones that are supported by object detection architecture in TAO: ```sh ngc registry model list nvidia/tao/pretrained_efficientdet:* ``` 4. To download the model: ```sh ngc registry model download-version nvidia/tao/pretrained_efficientdet: