LPRNet recognizes characters in license plates from cropped images. This model is ready for commercial use.
Architecture Type: Convolutional Neural Network (CNN)
Network Architecture: ResNet
Input Type(s): Image
Input Format(s): Red, Green, Blue (RGB)
Input Parameters: 3D
Other Properties Related to Input: Fixed RGB resolution of 3 x 48 x 96 (C x H x W). Channel ordering of the input is CHW, where C = number of channels (3), H = height of the images (48), and W = width of the images (96). No minimum bit depth, alpha, or gamma.
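To make the CHW ordering concrete, here is a minimal sketch that reorders an image stored as height x width x channel (the layout most image decoders return) into the 3 x 48 x 96 layout described above. The helper name `hwc_to_chw` and the sample crop are hypothetical, and the sketch assumes the crop has already been resized to 48 x 96; any pixel scaling or normalization the model expects is not specified in this card and is left out.

```python
from typing import List

# Fixed input resolution stated in the model card (C, H, W).
C, H, W = 3, 48, 96


def hwc_to_chw(pixels: List[List[List[int]]]) -> List[List[List[int]]]:
    """Reorder an H x W x C nested list into C x H x W (hypothetical helper)."""
    height = len(pixels)
    width = len(pixels[0])
    channels = len(pixels[0][0])
    if (channels, height, width) != (C, H, W):
        raise ValueError(
            f"expected {(C, H, W)} as (C, H, W), got {(channels, height, width)}"
        )
    # Outer index becomes the channel, then the row, then the column.
    return [
        [[pixels[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# Hypothetical 48 x 96 RGB crop filled with a single colour.
crop = [[[255, 128, 0] for _ in range(W)] for _ in range(H)]
chw = hwc_to_chw(crop)
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 48 96
```

In practice an array library would do this with a single transpose; the nested-list version only spells out which index moves where.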
Output Type(s): Label(s)
Output Format: Label: Text String
Other Properties Related to Output: Category Label(s): license plate alpha-numeric characters
Runtime Engine(s):
Supported Hardware Architecture(s):
Supported Operating System(s):
Data Collection Method by dataset:
Labeling Method by dataset:
Properties:
LPRNet model has been trained on two datasets:
Characters distribution:
character | number |
---|---|
0 | 100688 |
1 | 117499 |
2 | 98599 |
3 | 111220 |
4 | 127387 |
5 | 148325 |
6 | 175541 |
7 | 231298 |
8 | 105170 |
9 | 111234 |
A | 36350 |
B | 33677 |
C | 40292 |
D | 39447 |
E | 36787 |
F | 34734 |
G | 40474 |
H | 38751 |
I | 12645 |
J | 34155 |
K | 37397 |
L | 40900 |
M | 36544 |
N | 40431 |
P | 38198 |
Q | 11086 |
R | 40899 |
S | 38820 |
T | 41155 |
U | 45471 |
V | 35998 |
W | 38096 |
X | 37468 |
Y | 34454 |
Z | 31963 |
Illumination: sunny, cloudy, rainy, bright, dim.
Locations of dataset collection: US roads and parking lots, mainly in California.
Camera mounting location: mainly the dash camera and the side camera in cars.
Camera angles: Assume the camera sensor is at the center of the camera coordinate system. The X-axis is horizontal and points to the right, the Y-axis is vertical and points up, and the Z-axis points outward. In this coordinate system, license plates in the following positions are chosen:
License plate image shapes:
measurement | min | max | avg |
---|---|---|---|
height | 17 | 1924 | 54 |
width | 35 | 3896 | 109 |
aspect-ratio (width/height) | 0.9 | 3.6 | 2.0 |
Some sample images (before cropping the license plates) can be found in the output annotated images section of LPD's model card.
The dataset must be organized in the following format.
/Dataset_01
    /images
        0000.jpg
        0001.jpg
        0002.jpg
        ...
        ...
        ...
        N.jpg
    /labels
        0000.txt
        0001.txt
        0002.txt
        ...
        ...
        ...
        N.txt
    /characters_list.txt
Each cropped license plate image has a corresponding label text file that contains a single line with the characters of that license plate. A characters_list.txt file lists all the characters found in the license plate dataset, one character per line.
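The layout rules above can be checked mechanically before training. The following is a minimal sketch, not part of TLT: the function name `check_dataset` is hypothetical, and it assumes that `characters_list.txt` sits directly inside the dataset root, that images use the `.jpg` extension, and that a label shares its image's file stem. The demo builds a tiny throwaway dataset in a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import List


def check_dataset(root: Path) -> List[str]:
    """Return the problems found in a dataset directory (empty list if none)."""
    charset_file = root / "characters_list.txt"
    if not charset_file.is_file():
        return ["missing characters_list.txt"]
    # One character per line; blank lines are ignored.
    lines = charset_file.read_text(encoding="utf-8").splitlines()
    charset = {line.strip() for line in lines if line.strip()}

    problems = []
    for image in sorted((root / "images").glob("*.jpg")):
        label_file = root / "labels" / (image.stem + ".txt")
        if not label_file.is_file():
            problems.append(f"{image.name}: no label file")
            continue
        label_lines = label_file.read_text(encoding="utf-8").splitlines()
        if len(label_lines) != 1:
            problems.append(
                f"{label_file.name}: expected one line, found {len(label_lines)}"
            )
            continue
        unknown = sorted(set(label_lines[0].strip()) - charset)
        if unknown:
            problems.append(
                f"{label_file.name}: characters not in characters_list.txt: "
                f"{''.join(unknown)}"
            )
    return problems


# Demo on a throwaway dataset; file names and plate text are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Dataset_01"
    (root / "images").mkdir(parents=True)
    (root / "labels").mkdir()
    (root / "characters_list.txt").write_text(
        "\n".join("0123456789ABC") + "\n", encoding="utf-8"
    )
    (root / "images" / "0000.jpg").touch()
    (root / "labels" / "0000.txt").write_text("1ABC23\n", encoding="utf-8")
    (root / "images" / "0001.jpg").touch()  # deliberately left without a label
    problems = check_dataset(root)

print(problems)  # ['0001.jpg: no label file']
```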
Data Collection Method by dataset:
Labeling Method by dataset:
Properties:
The model was evaluated on approximately 100,000 images from the Chinese City Parking Dataset (CCPD), collected in a provincial capital of China.
The key performance indicator (KPI) is the accuracy of license plate recognition. A recognition counts as accurate only when all the characters in a license plate are recognized correctly. The KPIs for the evaluation data are reported below.
model | dataset | accuracy |
---|---|---|
us_lprnet_baseline18_unpruned | NVIDIA LPR eval dataset | 97.49% |
ch_lprnet_baseline18_unpruned | CCPD_base_val | 99.67% |
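Because a plate counts as correct only when every character matches, the metric is a whole-string exact match rather than a per-character score. A minimal sketch of that computation follows; the function name `plate_accuracy` and the sample plates are hypothetical and are not taken from the evaluation data.

```python
from typing import Sequence


def plate_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of plates whose predicted string equals the label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one plate is required")
    correct = sum(pred == truth for pred, truth in zip(predictions, labels))
    return correct / len(labels)


# Made-up plates: the third prediction differs in a single character,
# so the whole plate is counted as wrong.
labels = ["7ABC123", "6XYZ789", "4DEF456", "5GHJ321"]
predictions = ["7ABC123", "6XYZ789", "4DEF458", "5GHJ321"]
print(f"{plate_accuracy(predictions, labels):.2%}")  # 75.00%
```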
Engine: TensorRT
Test Hardware:
The inference uses FP16 precision. Inference performance was measured with trtexec on Jetson Nano, Xavier NX, AGX Xavier and NVIDIA T4 GPU. The Jetson devices run at Max-N configuration for maximum system performance. The figures below are for inference only. End-to-end performance with streaming video data might vary slightly depending on the application's use case.
Device | precision | batch_size | FPS |
---|---|---|---|
Jetson Nano | FP16 | 32 | 16 |
Jetson NX | FP16 | 32 | 600 |
Jetson Xavier | FP16 | 64 | 1021 |
T4 | FP16 | 128 | 3821 |
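The table reports throughput only. If FPS is read as cropped plate images processed per second across the whole batch (an assumption; the card does not define it), the time to process one full batch follows as batch_size / FPS. A short sketch of that arithmetic, with a hypothetical helper name:

```python
# (batch_size, FPS) pairs copied from the table above.
BENCHMARKS = {
    "Jetson Nano": (32, 16),
    "Jetson NX": (32, 600),
    "Jetson Xavier": (64, 1021),
    "T4": (128, 3821),
}


def batch_latency_ms(batch_size: int, fps: float) -> float:
    """Milliseconds per batch, assuming FPS counts images per second."""
    return 1000.0 * batch_size / fps


for device, (batch_size, fps) in BENCHMARKS.items():
    latency = batch_latency_ms(batch_size, fps)
    print(f"{device}: {latency:.1f} ms per batch of {batch_size}")
```

Under this reading, large batches maximize throughput at the cost of per-batch latency, which matters when plates must be read in near real time.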
This model needs to be used with NVIDIA Hardware and Software. For Hardware, the model can run on any NVIDIA GPU including NVIDIA Jetson devices. This model can only be used with Transfer Learning Toolkit (TLT), DeepStream SDK or TensorRT.
The primary use case intended for this model is to recognize the license plate characters from a cropped RGB license plate image.
There are two models provided:
They are intended for training and fine-tuning using Transfer Learning Toolkit and the users' dataset of license plates in the United States of America or China. High-fidelity models can be trained for new use cases. The Jupyter notebook available as part of the TLT container can be used to re-train.
These models are also intended for easy deployment to the edge using DeepStream SDK or TensorRT. They accept 3x48x96 dimension input tensors and output the sequence of predicted character IDs. DeepStream provides facilities to create efficient video analytic pipelines that capture, decode, and pre-process the data before running inference.
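Turning the predicted ID sequence into a plate string needs a post-processing step that this card does not spell out. The sketch below shows one common approach, greedy CTC-style decoding, under assumptions that should be verified against the deployed model: each ID indexes into the character list in file order, an extra "blank" ID equal to the list length means no character, and consecutive repeats of the same ID collapse into one. The function name `decode_ids`, the shortened character list, and the ID sequence are all hypothetical.

```python
from typing import List, Sequence


def decode_ids(ids: Sequence[int], characters: Sequence[str]) -> str:
    """Greedy CTC-style decode: collapse repeats, then drop blanks."""
    blank = len(characters)  # assumption: blank ID follows the last character
    decoded: List[str] = []
    previous = blank
    for current in ids:
        # Emit a character only when the ID changes and is not the blank.
        if current != previous and current != blank:
            decoded.append(characters[current])
        previous = current
    return "".join(decoded)


# Hypothetical, shortened character list (a real one comes from
# characters_list.txt) and a made-up ID sequence; 13 is the blank here.
characters = list("0123456789ABC")
ids = [7, 7, 13, 10, 11, 11, 13, 13, 1, 13, 1, 2]
print(decode_ids(ids, characters))  # 7AB112
```

The blank between the two `1` IDs is what lets a genuinely doubled character survive the repeat-collapsing step.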
The models are encrypted and can be decrypted with the following key:
nvidia_tlt
Please make sure to use this as the key for all TLT commands that require a model load key.
To create the entire end-to-end video analytic application, deploy this model with DeepStream SDK. DeepStream SDK is a streaming analytic toolkit to accelerate building AI-based video analytic applications. DeepStream supports direct integration of this model into the deepstream sample app.
To deploy this model with DeepStream 5.1, please follow the instructions in this repository.
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Promise and the Explainability, Bias, Safety & Security, and Privacy Subcards.