LPDNet

Description

Object Detection network to detect license plates in an image of a car.

Publisher

NVIDIA

Use Case

Object Detection

Framework

Transfer Learning Toolkit

Latest Version

pruned_v2.1

Modified

May 25, 2022

Size

3.79 MB

License Plate Detection (LPDNet) Model Card

Model Overview

The models described in this card detect one or more license plate objects from a car image and return a box around each object, as well as an lpd label for each object. TAO Toolkit provides two kinds of pretrained LPD models: one is based on the DetectNet_v2 (version 1.0) network; the other is based on the YOLOv4-tiny (version 2.0) network.

The DetectNet_v2 based model (version 1.0) comes in two variants: one trained on an NVIDIA-owned US license plate dataset and another trained on the public Chinese City Parking Dataset (CCPD).

The YOLOv4-tiny based model (version 2.0) is trained on an NVIDIA-owned US license plate dataset.

Model Architecture

DetectNet_v2 models are based on the NVIDIA DetectNet_v2 detector with ResNet18 as the feature extractor. This architecture, also known as GridBox object detection, uses bounding-box regression on a uniform grid over the input image. The GridBox system divides an input image into a grid, and each grid cell predicts four normalized bounding-box parameters (xc, yc, w, h) and a confidence value per output class. The raw normalized bounding-box and confidence detections need to be post-processed by a clustering algorithm such as DBSCAN or NMS to produce the final bounding-box coordinates and category labels.
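
To illustrate the post-processing step mentioned above, here is a minimal sketch of greedy non-maximum suppression (NMS) in plain Python. It is not the DeepStream or TAO implementation; the corner-format boxes, the `iou` and `nms` helper names, and the 0.5 IoU threshold are all assumptions chosen for the example.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_thresh: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


# Two overlapping candidates for the same plate, plus one separate plate.
candidates = [(100, 100, 200, 150), (104, 102, 204, 152), (400, 300, 480, 340)]
confidences = [0.90, 0.75, 0.60]
kept = nms(candidates, confidences)
print(kept)
```

The second candidate overlaps the first heavily and has a lower confidence, so only the first and third boxes survive.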

YOLOv4-tiny models are based on the YOLOv4-tiny detector with cspdarknet_tiny as the feature extractor.

Training

The training algorithm optimizes the network to minimize the localization and confidence loss for the objects.

Training Data

The US license plate models are trained on a proprietary dataset with over 45000 US car images.

The Chinese license plate model is trained on the public CCPD dataset (Chinese City Parking Dataset) with about 172000 images. All images were taken manually by workers of a roadside parking management company in the streets of a provincial capital of China. The details of this dataset can be found in "Towards end-to-end license plate detection and recognition: A large dataset and baseline" (ECCV 2018).

Performance

Evaluation Data

The evaluation dataset for US LPDNet was collected in the same way as the training dataset. The images were picked manually from the raw images to cover a diverse range of angles, illumination and sharpness. The evaluation dataset for Chinese LPDNet includes 14% of the images in CCPD-Base (the base sub-dataset in CCPD).

Methodology and KPI

The key performance indicator is the accuracy of license plate detection. The KPIs for the evaluation data are reported in the tables below.

Version 1.0 models (DetectNet_v2)

Model                 Dataset                      Accuracy
usa_unpruned_model    NVIDIA 3k LPD eval dataset   98.58%
usa_pruned_model      NVIDIA 3k LPD eval dataset   98.46%
ccpd_unpruned_model   14% of CCPD-Base dataset     99.24%
ccpd_pruned_model     14% of CCPD-Base dataset     99.22%

Version 2.0 models (YoloV4-Tiny)

Model                              Dataset                      Accuracy
yolov4_tiny_usa_trainable_model    NVIDIA 3k LPD eval dataset   99.53%
yolov4_tiny_usa_deployable_model   NVIDIA 3k LPD eval dataset   99.61%

The version 2.0 models based on YOLOv4-tiny improve accuracy over the version 1.0 models.

Real-time Inference Performance

Inference is run on the provided pruned (deployable) models at INT8 precision, except on the Jetson Nano, where FP16 precision is used. Inference performance is measured with trtexec on Jetson Nano, Jetson TX2, AGX Xavier, Xavier NX and NVIDIA T4 GPU. The Jetson devices run at Max-N configuration for maximum system performance. The performance shown below is only for inference of the USA pruned (deployable) model. End-to-end performance with streaming video data may vary slightly depending on the application.

Version 2.0 (YOLOv4-tiny) models provide higher accuracy, but at a cost in inference performance: the v2.0 model runs at a lower FPS than the v1.0 model on every device listed. For version 1.0 (DetectNet_v2) based models:

Device Precision Batch_size FPS
Nano FP16 1 66
TX2 INT8 1 187
NX INT8 1 461
Xavier INT8 1 913
T4 INT8 1 2748

For version 2.0 (YOLOv4-tiny) based models:

Device Precision Batch_size FPS
Nano FP16 1 40
TX2 INT8 1 105
NX INT8 1 225
Xavier INT8 1 489
T4 INT8 1 1264
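
To make the accuracy-versus-speed trade-off concrete, the FPS figures from the two tables can be turned into per-frame latency and a slowdown factor. The sketch below only does arithmetic on the numbers quoted above; the dictionary and function names are hypothetical.

```python
# FPS values copied from the two tables above (batch size 1).
fps_v1 = {"Nano": 66, "TX2": 187, "NX": 461, "Xavier": 913, "T4": 2748}
fps_v2 = {"Nano": 40, "TX2": 105, "NX": 225, "Xavier": 489, "T4": 1264}


def latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds at the given throughput."""
    return 1000.0 / fps


# How many times slower v2.0 is than v1.0 on each device.
slowdown = {device: fps_v1[device] / fps_v2[device] for device in fps_v1}

for device in fps_v1:
    print(
        f"{device:7s} v1.0: {latency_ms(fps_v1[device]):6.2f} ms/frame  "
        f"v2.0: {latency_ms(fps_v2[device]):6.2f} ms/frame  "
        f"slowdown: {slowdown[device]:.2f}x"
    )
```

On these figures, the v2.0 model is roughly 1.6x to 2.2x slower than v1.0, depending on the device.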

How to use Version 1.0 (Detectnet_v2) based model

These models need to be used with NVIDIA Hardware and Software. For Hardware, the models can run on any NVIDIA GPU including NVIDIA Jetson devices. These models can only be used with Train Adapt Optimize (TAO) Toolkit, DeepStream SDK or TensorRT.

For Version 1.0 (Detectnet_v2) based model, there are four models provided:

  • usa_unpruned.tlt
  • usa_pruned.etlt
  • ccpd_unpruned.tlt
  • ccpd_pruned.etlt

The unpruned models are intended for training and fine-tuning using TAO Toolkit along with the user's dataset of license plates in the United States of America or China. High fidelity models can be trained and adapted to the use case. The DetectNet_v2 Jupyter notebook available as a part of TAO CV resources can be used to re-train.

The USA pruned models are intended for easy deployment to the edge using DeepStream SDK or TensorRT. They accept 640x480x3 dimension input tensors and output a 40x30x12 bbox coordinate tensor and a 40x30x3 class confidence tensor.

The CCPD pruned models are intended for easy deployment to the edge using DeepStream SDK or TensorRT. These models accept 720x1168x3 dimension input tensors and output a 45x73x12 bbox coordinate tensor and a 45x73x3 class confidence tensor.

DeepStream provides a toolkit to create efficient video analytics pipelines to capture, decode, and pre-process the data before running inference. DeepStream will then post-process the output bbox coordinate tensor and class confidence tensor with the NMS or DBSCAN clustering algorithm to create appropriate bounding boxes. The sample application and config file to run these models are provided in the DeepStream SDK.
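
The input and output sizes quoted above are consistent with an output grid that is 16 times coarser than the input in each spatial dimension. The stride of 16 is inferred here from those numbers and is not stated in this card, and `grid_shape` is a hypothetical helper, so treat this as a sanity check and not as a specification.

```python
def grid_shape(width: int, height: int, stride: int = 16) -> tuple:
    """Return the (columns, rows) of the output grid for a given input size.

    The stride of 16 is an assumption inferred from the shapes in the text.
    """
    if width % stride or height % stride:
        raise ValueError("input size must be a multiple of the stride")
    return width // stride, height // stride


print(grid_shape(640, 480))    # US model input (W x H)
print(grid_shape(720, 1168))   # CCPD model input (W x H)
```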

The unpruned and pruned models are encrypted and can be decrypted with the following key:

  • Model load key: nvidia_tlt

Please make sure to use this as the key for all TAO commands that require a model load key.

Model versions

  • unpruned_v1.0 - ResNet18 based pre-trained model. Intended for training.
  • pruned_v1.0 - ResNet18 deployment models. Contains calibration cache for GPU and DLA. DLA one is required if running inference on Jetson AGX Xavier or Xavier NX DLA.
  • unpruned_v2.0 - CSPDarkNet-Tiny based YOLOv4-Tiny pre-trained model. Intended for training.
  • pruned_v2.0 - CSPDarkNet-Tiny based YOLOv4-Tiny deployment model. Contains calibration cache for INT8 deployment on GPU and DLA.

Input

  • For US license plate

    • Color Images of resolution 640 X 480 X 3 (W x H x C)
    • Channel Ordering of the Input: NCHW, where N = Batch Size, C = number of channels (3), H = Height of images (480), W = Width of the images (640)
    • Input scale: 1/255.0
    • Mean subtraction: None
  • For Chinese license plate

    • Color Images of resolution 720 X 1168 X 3 (W x H x C)
    • Channel Ordering of the Input: NCHW, where N = Batch Size, C = number of channels (3), H = Height of images (1168), W = Width of the images (720)
    • Input scale: 1/255.0
    • Mean subtraction: None
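
The input requirements above (NCHW layout, scale of 1/255.0, no mean subtraction) can be sketched in plain Python as follows. The nested-list image representation and the `hwc_to_chw_scaled` helper are assumptions for illustration; a real pipeline would normally use an array library or let DeepStream do this step.

```python
from typing import List


def hwc_to_chw_scaled(image: List[List[List[int]]], scale: float = 1 / 255.0):
    """Convert an H x W x C image of 0-255 values to C x H x W floats.

    Each value is multiplied by `scale`; no mean is subtracted.
    """
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[y][x][c] * scale for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A tiny 1 x 2 RGB image: one red pixel and one green pixel.
tiny = [[[255, 0, 0], [0, 255, 0]]]
chw = hwc_to_chw_scaled(tiny)

# Adding a leading batch dimension gives the NCHW layout with N = 1.
nchw = [chw]
print(len(nchw), len(chw), len(chw[0]), len(chw[0][0]))
```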

Output

Category labels (lpd) and bounding-box coordinates for each detected license plate in the input image.

Instructions to use version 1.0 unpruned model with TAO

To use these models as pretrained weights for transfer learning, use the snippet below as a template for the model_config component of the experiment spec file to train a DetectNet_v2 model. For more information on the experiment spec file, please refer to the TAO Toolkit DetectNet_v2 Guide.

  1. For ResNet18
model_config {
  num_layers: 18
  pretrained_model_file: "/path/to/the/model.tlt"
  use_batch_norm: true
  objective_set {
    bbox {
      scale: 35.0
      offset: 0.5
    }
    cov {
    }
  }
  training_precision {
    backend_floatx: FLOAT32
  }
  arch: "resnet"
}

Instructions to deploy version 1.0 models with DeepStream

To create the entire end-to-end video analytics application, deploy these models with DeepStream SDK. DeepStream SDK is a streaming analytics toolkit to accelerate building AI-based video analytics applications. DeepStream supports direct integration of these models into the deepstream sample app.

To deploy these models with DeepStream 6.0, please follow the instructions below:

Download and install DeepStream SDK. The installation instructions for DeepStream are provided in DeepStream development guide. The config files for the purpose-built models are located in:

/opt/nvidia/deepstream/deepstream-6.0/samples/configs/tao_pretrained_models

/opt/nvidia/deepstream is the default DeepStream installation directory. This path will be different if you are installing in a different directory.

You need to create 1 label file and 2 config files.

labels_lpdnet.txt - Label file with 1 class
deepstream_app_source1_trafficcamnet_lpdnet.txt - Main config file for DeepStream app
config_infer_secondary_lpdnet.txt - File to configure inference settings 

Create label file labels_lpdnet.txt

echo lpd > labels_lpdnet.txt

Create config file deepstream_app_source1_trafficcamnet_lpdnet.txt

cp deepstream_app_source1_trafficcamnet.txt deepstream_app_source1_trafficcamnet_lpdnet.txt

Modify config file deepstream_app_source1_trafficcamnet_lpdnet.txt by adding the following lines to it.

[secondary-gie0]
enable=1
model-engine-file=usa_pruned.etlt_b4_gpu0_int8.engine
gpu-id=0
batch-size=4
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=config_infer_secondary_lpdnet.txt

Create config file config_infer_secondary_lpdnet.txt

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
labelfile-path=<path to labels_lpdnet.txt>
tlt-encoded-model=<path to etao_model>
tlt-model-key=nvidia_tlt
int8-calib-file=<path to calibration cache>
# For the US model, set to 3;480;640;0. For the CCPD model, set to 3;1168;720;0.
uff-input-dims=3;480;640;0
uff-input-blob-name=input_1
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=1
##1 Primary 2 Secondary
process-mode=2
interval=0
gie-unique-id=2
#0 detector 1 classifier 2 segmentation 3 instance segmentation
network-type=0
operate-on-gie-id=1
operate-on-class-ids=0
cluster-mode=3
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd
input-object-min-height=30
input-object-min-width=40
#enable-dla=1

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
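
Before launching the app, it can help to check that the inference config matches the model being deployed. The sketch below uses Python's configparser as a stand-in reader (DeepStream parses the file itself with its own parser) and checks two values taken from this card: the C;H;W input dimensions and a net-scale-factor close to 1/255. `check_config`, `EXPECTED_CHW` and the shortened `CONFIG_TEXT` are hypothetical names and data for this example.

```python
import configparser

# A shortened copy of the [property] section shown above.
CONFIG_TEXT = """\
[property]
net-scale-factor=0.0039215697906911373
uff-input-dims=3;480;640;0
batch-size=16
network-mode=1
num-detected-classes=1

[class-attrs-all]
pre-cluster-threshold=0.3
"""

# Expected (C, H, W) per model, from the input sizes given in this card.
EXPECTED_CHW = {"usa": (3, 480, 640), "ccpd": (3, 1168, 720)}


def check_config(text: str, model: str) -> list:
    """Return a list of human-readable problems found in the config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    prop = parser["property"]
    problems = []

    dims = tuple(int(v) for v in prop["uff-input-dims"].split(";")[:3])
    if dims != EXPECTED_CHW[model]:
        problems.append(f"uff-input-dims {dims} != {EXPECTED_CHW[model]}")

    scale = float(prop["net-scale-factor"])
    if abs(scale - 1 / 255.0) > 1e-6:
        problems.append(f"net-scale-factor {scale} is not close to 1/255")

    return problems


print(check_config(CONFIG_TEXT, "usa"))
print(check_config(CONFIG_TEXT, "ccpd"))
```

The same text passes for the US model and reports a dimension mismatch for the CCPD model, which is the mistake this check is meant to catch.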

Run deepstream-app:

deepstream-app -c deepstream_app_source1_trafficcamnet_lpdnet.txt

Documentation to deploy with DeepStream is provided in the "Deploying to DeepStream" chapter of the TAO User Guide.

How to use Version 2.0 (YOLOv4-tiny) based model

These models need to be used with NVIDIA Hardware and Software. For Hardware, the models can run on any NVIDIA GPU including NVIDIA Jetson devices. These models can only be used with Train Adapt Optimize (TAO) Toolkit, DeepStream SDK or TensorRT.

Primary use case intended for these models is detecting license plates in a color (RGB) image. The model can be used to detect license plates from photos and videos by using appropriate video or image decoding and pre-processing.

For the YOLOv4-tiny based model, two models are provided:

  • yolov4_tiny_usa_trainable.tlt
  • yolov4_tiny_usa_deployable.etlt

The trainable model is intended for training and fine-tuning using TAO Toolkit along with the user's dataset of license plates in the United States of America or China. High fidelity models can be trained and adapted to the use case. The YOLOv4-tiny Jupyter notebook available as a part of TAO CV resources can be used to re-train.

The USA deployable model is intended for easy deployment to the edge using DeepStream SDK or TensorRT.

The trainable and deployable models are encrypted and can be decrypted with the following key:

  • Model load key: nvidia_tlt

Please make sure to use this as the key for all TAO commands that require a model load key.

Instructions to use version 2.0 trainable model with TAO

To use these models as pretrained weights for transfer learning, use the snippet below as a template for the training_config component of the experiment spec file to train a YOLOv4-tiny model. For more information on the experiment spec file, please refer to the TAO Toolkit YOLOv4-tiny Guide.

  1. Set pretrain_model_path in training_config section
training_config {
  pretrain_model_path: "/path/to/the/yolov4_tiny_usa_trainable.tlt"
  ...
}

Instructions to deploy version 2.0 models with DeepStream

To create the entire end-to-end video analytics application, deploy these models with DeepStream SDK. DeepStream SDK is a streaming analytics toolkit to accelerate building AI-based video analytics applications. DeepStream supports direct integration of these models into the deepstream sample app.

To deploy these models with DeepStream 6.0, please follow the instructions below:

Download and install DeepStream SDK. The installation instructions for DeepStream are provided in DeepStream development guide. The config files for the purpose-built models are located in:

/opt/nvidia/deepstream/deepstream-6.0/samples/configs/tao_pretrained_models

/opt/nvidia/deepstream is the default DeepStream installation directory. This path will be different if you are installing in a different directory.

You need to create 1 label file and 2 config files.

labels_lpdnet.txt - Label file with 1 class
deepstream_app_source1_trafficcamnet_lpdnet.txt - Main config file for DeepStream app
config_infer_secondary_lpdnet.txt - File to configure inference settings

Create label file labels_lpdnet.txt

echo lpd > labels_lpdnet.txt

Create config file deepstream_app_source1_trafficcamnet_lpdnet.txt

cp deepstream_app_source1_trafficcamnet.txt deepstream_app_source1_trafficcamnet_lpdnet.txt

Modify config file deepstream_app_source1_trafficcamnet_lpdnet.txt by adding the following lines to it.

[secondary-gie0]
enable=1
model-engine-file=yolov4_tiny_usa_deployable.etlt_b4_gpu0_int8.engine
gpu-id=0
batch-size=4
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=0;
config-file=config_infer_secondary_lpdnet.txt

Create config file config_infer_secondary_lpdnet.txt

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=<path to labels_lpdnet.txt>
tlt-encoded-model=<path to deployable_model>
tlt-model-key=nvidia_tlt
int8-calib-file=<path to calibration cache>
uff-input-dims=3;480;640;0
uff-input-blob-name=Input
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=1
##1 Primary 2 Secondary
process-mode=2
interval=0
gie-unique-id=2
#0 detector 1 classifier 2 segmentation 3 instance segmentation
network-type=0
operate-on-gie-id=1
operate-on-class-ids=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=deepstream_tao_apps/post_processor/libnvds_infercustomparser_tao.so
#enable-dla=1

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
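
Unlike the version 1.0 config, this config uses net-scale-factor=1.0 together with per-channel offsets and model-color-format=1 (BGR). DeepStream's nvinfer applies these as y = net-scale-factor * (x - offset) per channel. The sketch below applies that formula to a single pixel; `normalize_pixel` and `rgb_to_bgr` are hypothetical helpers, and the constants are copied from the config above.

```python
NET_SCALE_FACTOR = 1.0
# Per-channel offsets from the config, in BGR order (model-color-format=1).
OFFSETS_BGR = (103.939, 116.779, 123.68)


def rgb_to_bgr(rgb: tuple) -> tuple:
    """Reverse the channel order of one RGB pixel."""
    return rgb[::-1]


def normalize_pixel(bgr: tuple) -> tuple:
    """Apply y = net-scale-factor * (x - offset) to each channel."""
    return tuple(
        NET_SCALE_FACTOR * (value - offset)
        for value, offset in zip(bgr, OFFSETS_BGR)
    )


# An orange RGB pixel, converted to BGR and then normalized.
example = normalize_pixel(rgb_to_bgr((255, 128, 0)))
print(example)
```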

Build libnvds_infercustomparser_tao.so

git clone -b release/tao3.0  https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps/post_processor/
export CUDA_VER=xy.z                  # xy.z is CUDA version, e.g. 11.4
make
cd -

Run deepstream-app:

deepstream-app -c deepstream_app_source1_trafficcamnet_lpdnet.txt

Documentation to deploy with DeepStream is provided in the "Deploying to DeepStream" section of the TAO User Guide.

Limitations

Aspect ratio of cropped car image

The LPD network for US license plates was trained on cropped car images with 640x480 resolution. Therefore, the expected detection results are most likely on cropped car images with an aspect ratio of 4:3.
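
One possible way to bring an arbitrary car crop to a 4:3 aspect ratio without stretching it is to pad it onto a larger canvas. This card does not prescribe padding, so the approach and the `pad_to_aspect` helper below are assumptions for illustration only.

```python
def pad_to_aspect(width: int, height: int, aspect_w: int = 4, aspect_h: int = 3) -> tuple:
    """Smallest canvas with ratio aspect_w:aspect_h that contains the crop.

    Returns (canvas_width, canvas_height). Uses integer ceiling division so
    the canvas is never smaller than the crop.
    """
    if width * aspect_h >= height * aspect_w:
        # Crop is too wide (or already exact): keep width, grow height.
        return width, -(-width * aspect_h // aspect_w)
    # Crop is too tall: keep height, grow width.
    return -(-height * aspect_w // aspect_h), height


print(pad_to_aspect(640, 480))   # already 4:3
print(pad_to_aspect(300, 300))   # square crop
print(pad_to_aspect(800, 300))   # wide crop
```

The padded canvas can then be resized to 640x480 without distorting the plate.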

Occluded Cars

When cars are occluded or truncated too much, the license plate may not be detected by the LPDNet model.

Dark-lighting, Monochrome or Infrared Camera Images

The LPDNet models were trained on RGB images in good lighting conditions. Therefore, images captured in dark lighting conditions, monochrome images, or IR camera images may not provide good detection results.

Camera Positions

Assume the camera sensor is at the center of the camera coordinate system. The X-axis is horizontal and points to the right, the Y-axis is vertical and points up, and the Z-axis points outward. In this coordinate system, the LPD network may provide the expected detection results under the following conditions:

  • Roll: within -30 to +30 degrees
  • Pitch: within -30 to +30 degrees
  • Yaw: within -15 to +15 degrees
  • Distance to license plate: close enough that the license plate in the image is larger than 16x16 pixels
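
The conditions above can be expressed as a simple pre-check, for example to filter out camera placements that fall outside the supported range. The `PlateView` class and the `within_expected_conditions` function are hypothetical; only the numeric limits come from the list above.

```python
from dataclasses import dataclass

# Limits taken from the list above.
MAX_ROLL_DEG = 30.0
MAX_PITCH_DEG = 30.0
MAX_YAW_DEG = 15.0
MIN_PLATE_PX = 16


@dataclass
class PlateView:
    """Hypothetical description of how a plate appears to the camera."""
    roll_deg: float
    pitch_deg: float
    yaw_deg: float
    plate_width_px: int
    plate_height_px: int


def within_expected_conditions(view: PlateView) -> bool:
    """True if the view satisfies every condition listed above."""
    return (
        abs(view.roll_deg) <= MAX_ROLL_DEG
        and abs(view.pitch_deg) <= MAX_PITCH_DEG
        and abs(view.yaw_deg) <= MAX_YAW_DEG
        and view.plate_width_px > MIN_PLATE_PX
        and view.plate_height_px > MIN_PLATE_PX
    )


print(within_expected_conditions(PlateView(10, -5, 12, 60, 30)))
print(within_expected_conditions(PlateView(10, -5, 20, 60, 30)))
```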

Restricted usage in different regions

The NVIDIA LPDNet model for the US is trained on license plates collected in California, so it is not expected to reach the same level of accuracy on license plates from other states. The NVIDIA LPDNet model for China is trained on license plates collected in Anhui province.

In general, to get better accuracy in a region other than the ones covered by the pretraining datasets (US-California and China-Anhui), more data from that region is needed to fine-tune the pretrained model through TAO Toolkit.

License

License to use these models is covered by the Model EULA. By downloading the unpruned or pruned version of the model, you accept the terms and conditions of these licenses.

Ethical AI

NVIDIA LPDNet model detects license plates.

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.