License Plate Object Detection and Recognition using Deep Learning

Using YOLOv3 and pytesseract for Humain 2019

Anuj Shah
Analytics Vidhya
7 min read · Nov 12, 2019


Photo by Tom Grünbauer on Unsplash

The widespread integration of information technology into different parts of the modern world has led to vehicles being treated as resources within information systems. Since an information system has little value without data, there is a need to capture vehicle information from the real world in digital form.

There are two ways to capture this information: with the help of human operators, or with technology that identifies vehicles by their license plates. Surveillance plays a vital role in almost every industry, whether at construction sites, for road safety through helmet detection, or for license plate detection, which helps us identify unregistered plates.

Deep learning has a promising future in the field of detection and identification through computer vision. It involves the use of Convolutional Neural Networks for image classification, as well as Deep Neural Networks, Recurrent Neural Networks, and more. For object detection specifically, possible approaches include R-CNN, Faster R-CNN and other pretrained models.

Scope of work

A license plate detection project can help us identify violators of traffic rules, for example at signals or when exceeding the speed limit near schools. For two-wheelers, license plate detection can be combined with helmet detection to catch riders who are not wearing helmets. However, the basic problem with Indian license plate detection is the lack of good quality images, which are often taken from low resolution cameras.

Another problem is the lack of a defined or standard size for number plates, unlike in many foreign nations. While developing the solution, one aspect that came to my mind is that a portion of the license plate, usually the part above the characters and digits, carries some written information. For example, it may specify the designation of the owner in the case of a higher-ranking government official (see figure below). We do not want to consider this portion; we want to remove the unwanted written text and retrieve only the numbers and characters on the license plate.

NOTE: THE IMAGE USED IS SOLELY FOR EXPLANATION PURPOSES AND IS NOT INTENDED TO EXPOSE ANYONE'S PRIVATE LICENSE PLATE NUMBER OR DESIGNATION.

Indian license plates may also render the digits in arbitrary stylised shapes instead of machine-generated digits. As an example, the figure below shows the number “214” written as a sequence of characters spelling the name of Lord Ram in Hindi.

NOTE: THE IMAGE USED IS SOLELY FOR EXPLANATION PURPOSES AND IS NOT INTENDED TO EXPOSE ANYONE'S PRIVATE LICENSE PLATE NUMBER OR DESIGNATION.

SOLUTION APPROACH

ALGORITHMS/LIBRARIES USED:

  1. JSON file preprocessing: The dataset is provided as a JSON file specifying the content (image URL), label, image width, image height, and the x and y coordinates of the top-left and bottom-right corners of the bounding box. We have to separate all these fields and store them in CSV format. For this, I have used the pandas library in Python.
  2. For Data Augmentation: Darknet enables us to perform data augmentation by changing the saturation, hue and exposure (brightness) of images inside yolov3-tiny-obj.cfg, which contains the architecture of our model. For more information, please refer to [2].
  3. For Object Detection: The main decision in object detection is which model to use. There is a wide pool of models available to us, with variations of each. The widely used models broadly fall into YOLO (You Only Look Once), RetinaNet and SSD.

From the figure below, the best performing models are YOLOv3 and RetinaNet. YOLOv3 achieves a mean Average Precision (mAP) comparable to RetinaNet, yet its inference time is much smaller. Hence, I have used the YOLOv3-tiny model for object detection.

Comparison of various models with respect to inference time [3]

  4. For Optical Character Recognition: After getting the required bounding box on the license plate, we have to generate the string of characters and digits it contains. This task is done using the pytesseract library. For documentation, refer to [4].

IMPLEMENTATION DETAILS

PREPROCESSING STEPS

The data in the JSON file is converted into a .csv format with columns for content, label, the given coordinates of the bounding box, image height, image width, notes and extras. It is important to note that the coordinates are in normalised form. First, all the details are separated into content, notes, extras, top-left coordinates, etc. and stored in a csv file. For this, run the seperate.py file.
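As a rough sketch of what seperate.py might look like (the JSON field names such as content, annotation, points and imageWidth, and the file names, are assumptions based on the description above):

import json
import pandas as pd

rows = []
with open("Indian_Number_plates.json") as f:         # assumed file name
    for line in f:                                   # one JSON record per line
        record = json.loads(line)
        ann = record["annotation"][0]
        top_left, bottom_right = ann["points"][0], ann["points"][1]
        rows.append({
            "content": record["content"],            # image URL
            "label": ann["label"][0],
            "top_x": top_left["x"],                  # normalised coordinates
            "top_y": top_left["y"],
            "bottom_x": bottom_right["x"],
            "bottom_y": bottom_right["y"],
            "image_width": ann["imageWidth"],
            "image_height": ann["imageHeight"],
            "extras": record.get("extras", ""),
        })

pd.DataFrame(rows).to_csv("license_plates.csv", index=False)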

After getting all the details into the csv file, we scrape the images from the web links. The images are saved incrementally as image(i).jpg, where i runs from 0 to (size of the dataset - 1), and another column containing the name of each image is added to the csv file. Run image_save.py.
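A minimal sketch of image_save.py, assuming the csv produced above and using the requests library to download each image:

import pandas as pd
import requests

df = pd.read_csv("license_plates.csv")
image_names = []

for i, url in enumerate(df["content"]):
    name = "image({}).jpg".format(i)                 # image(0).jpg, image(1).jpg, ...
    with open(name, "wb") as f:
        f.write(requests.get(url, timeout=10).content)
    image_names.append(name)

df["image_name"] = image_names                       # new column with the file names
df.to_csv("license_plates.csv", index=False)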

The top-left and bottom-right coordinates are stored in normalised form in the csv file. We convert them to their unnormalised form by multiplying the x coordinate by the width and the y coordinate by the height of the image. Four columns with the unnormalised coordinate values are added to the dataset. This step is needed because YOLO expects a particular format for the coordinates of the bounding box surrounding the region of interest, which is discussed in the next paragraph. Run bounding_box.py. To draw the bounding box on random images, run draw_bounding_box.py.

Bounding box across license plates
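The conversion in bounding_box.py amounts to a few column operations (the column names here are assumptions carried over from the sketches above):

import pandas as pd

df = pd.read_csv("license_plates.csv")

# Multiply normalised x coordinates by the image width and normalised y
# coordinates by the image height to recover pixel values.
df["top_x_abs"] = df["top_x"] * df["image_width"]
df["top_y_abs"] = df["top_y"] * df["image_height"]
df["bottom_x_abs"] = df["bottom_x"] * df["image_width"]
df["bottom_y_abs"] = df["bottom_y"] * df["image_height"]

df.to_csv("license_plates.csv", index=False)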

In YOLO, we have to create a .txt file for each image with the same name as the image. For each bounding box, the .txt file contains one line with five values: the class label, which is 0 in our case, the center coordinates x_center and y_center, and the width and height of the bounding box, all normalised with respect to the dimensions of the whole image.

Orientation of annotated .txt file for YOLO

Run generating_annotations_test.py. For example, a bounding box with corners (582, 274) and (700, 321) in an image of width 806 and height 466 gets converted to:
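0 0.795285 0.638412 0.146402 0.100858

(assuming six decimal places; the exact rounding in generating_annotations_test.py may differ). The numbers follow from the conversion described above:

x_center = (582 + 700) / 2 / 806   # 641 / 806   ≈ 0.795285
y_center = (274 + 321) / 2 / 466   # 297.5 / 466 ≈ 0.638412
width    = (700 - 582) / 806       # 118 / 806   ≈ 0.146402
height   = (321 - 274) / 466       # 47 / 466    ≈ 0.100858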

SETTING DARKNET YOLOv3:

Clone the darknet YOLOv3 architecture from [2]. Execute the following commands in the command prompt to set up darknet.

Cloning Darknet from [2]
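On Linux or Mac the setup boils down to something like the following (on Windows, follow the build instructions in [2]):

git clone https://github.com/AlexeyAB/darknet.git
cd darknet
make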

Copy the images and their corresponding txt files into a folder named obj under data. Create an obj.names file specifying the class name. Under the data folder, also create an obj.data file specifying the following:

Specifying details in obj.data file
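For a single class, obj.data typically looks like this (the exact paths are assumptions and should match your own layout), while obj.names contains just one line with the class name, e.g. license_plate:

classes = 1
train  = data/train.txt
valid  = data/test.txt
names = data/obj.names
backup = backup/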

Since we are dealing with only one class, we have to change the .cfg file specifying the YOLO architecture by setting the filters of the output YOLO layers along with the classes parameter. Set classes=1 and filters=(classes+5)*3=18 [2] in our case.
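In yolov3-tiny-obj.cfg this change has to be made for each [yolo] block and for the [convolutional] block immediately preceding it; as a sketch, the edited lines look roughly like this:

[convolutional]
...
# (classes + 5) * 3 = (1 + 5) * 3 = 18
filters=18

[yolo]
...
classes=1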

TRAINING AND TESTING

Separate the images into train and test sets in the format shown in the figure below. Run split_train_test.py.

train.txt file snapshot stored in data folder
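train.txt (and similarly test.txt) simply lists the path of each image, one per line, e.g. data/obj/image(0).jpg. A minimal sketch of split_train_test.py, assuming an 80/20 split:

import glob
import random

images = glob.glob("data/obj/*.jpg")
random.seed(42)                                  # reproducible split (assumption)
random.shuffle(images)

split = int(0.8 * len(images))                   # 80% train, 20% test
with open("data/train.txt", "w") as f:
    f.write("\n".join(images[:split]))
with open("data/test.txt", "w") as f:
    f.write("\n".join(images[split:]))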

Run the following command to train the model:

Training object detection model
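With the darknet repository [2], the training command takes the obj.data file, the model .cfg and the pre-trained convolutional weights, along these lines:

darknet.exe detector train data/obj.data yolov3-tiny-obj.cfg yolov3-tiny.conv.15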

For Linux or Mac, replace darknet.exe with ./darknet

To test on images and get the coordinates of the region of interest, run predict.py with command line arguments for the image name, the architecture .cfg file, the trained weights file and the classes file (obj.names).

Note: Give the full path to the weights and other files.

predict.py --image path_to_img --config yolov3-tiny-obj.cfg --weights yolov3-tiny.conv.15 --classes data/obj.names

Prediction of license plate.

GETTING THE CHARACTERS(OCR):

Once we know the coordinates of the bounding box, crop the original image so that it contains only the license plate rather than the whole scene, using the PIL library.

Cropped image capturing region of interest
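A minimal sketch of the cropping step with PIL (the coordinates and file names here are placeholders):

from PIL import Image

# Bounding box corners returned by predict.py (placeholder values)
x_min, y_min, x_max, y_max = 582, 274, 700, 321

img = Image.open("image(0).jpg")
cropped = img.crop((x_min, y_min, x_max, y_max))   # (left, upper, right, lower)
cropped.save("trying.jpg")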

Then use tesseract to generate text from the image.

python ocr.py trying.jpg

trying.jpg is the name of the cropped image.
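ocr.py can be as small as the following sketch, assuming it takes the image path as its only command line argument:

import sys

import pytesseract
from PIL import Image

image_path = sys.argv[1]                 # e.g. trying.jpg
text = pytesseract.image_to_string(Image.open(image_path))
print(text)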

Output of the tesseract library

You can download the pretrained weights and necessary files from my GitHub repository:

https://github.com/anuj200199/licenseplatedetection

REFERENCES

[1] Object Detection Using Gradient-Based Learning by Yann LeCun, Patrick Haffner, Léon Bottou and Yoshua Bengio.

[2] Yolo-v3 and Yolo-v2 for Windows by AlexeyAB.

[3] YOLOv3 — You Only Look Once (Object Detection Improved YOLOv2, Comparable Performance with RetinaNet, 3.8× Faster!) by Sik-Ho Tsang.

[4] Python Tesseract by Mathias A Lee.
