OCR: Part 3 — OCR using OpenCV and CNN

Jan 3, 2019

In the previous parts (Part 1, Part 2), we saw how to recognize a random string in an image using only a CNN. We trained that model end to end, but the approach is not good enough to build a useful application such as reading text from a screenshot. In this part, we combine what we learned in the last two parts with image segmentation using OpenCV to build a pipeline that reads text from screenshots. Our final application should give output similar to the image below:

Approach:

Instead of recognizing everything in an image in one shot, we first segment the image into individual characters, then pass each segmented character through a CNN to classify it, and finally arrange the recognized characters to reproduce the text seen in the image. The diagram below illustrates the approach:

To achieve this, we need the following pieces of code:

Image segmentation
Segmented character data generation
CNN model for character classification

Once we have these three pieces of code, we can combine them to read text from a given image. Source code for this approach is available HERE.
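The last step of the pipeline, arranging recognized characters back into text, can be sketched in plain Python. This is an illustrative assumption rather than the repository's implementation: the function `boxes_to_text` and the `line_tol` and `space_gap` thresholds are hypothetical. Each input item is a bounding box plus the label predicted by the classifier.

```python
def boxes_to_text(chars, line_tol=0.5, space_gap=0.6):
    """Rebuild text from (x, y, w, h, label) tuples.

    Hypothetical helper: `line_tol` and `space_gap` are assumed
    thresholds expressed as fractions of character height and width.
    """
    if not chars:
        return ""

    def center_y(c):
        return c[1] + c[3] / 2

    # Group characters into lines by comparing vertical centers.
    ordered = sorted(chars, key=center_y)
    lines = [[ordered[0]]]
    for c in ordered[1:]:
        ref = lines[-1][-1]
        if abs(center_y(c) - center_y(ref)) <= line_tol * max(ref[3], c[3]):
            lines[-1].append(c)
        else:
            lines.append([c])

    rendered = []
    for line in lines:
        # Within a line, read left to right.
        line.sort(key=lambda c: c[0])
        text = line[0][4]
        for prev, cur in zip(line, line[1:]):
            gap = cur[0] - (prev[0] + prev[2])
            # A wide horizontal gap is treated as a word boundary.
            if gap > space_gap * max(prev[2], cur[2]):
                text += " "
            text += cur[4]
        rendered.append(text)
    return "\n".join(rendered)


# Boxes deliberately given out of order, as a segmenter might return them.
detected = [
    (62, 10, 10, 20, "2"),
    (10, 50, 10, 20, "o"),
    (10, 10, 10, 20, "h"),
    (50, 10, 10, 20, "4"),
    (22, 50, 10, 20, "k"),
    (22, 10, 10, 20, "i"),
]
print(boxes_to_text(detected))
```

The thresholds are relative to character size so the same values can work across font sizes, but they would still need tuning on real screenshots.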

Dataset preparation:

The dataset for this project contains the following characters:

“0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v w x y z A B D E F G H J K L M N Q R T Y . ( ) %”

In total, 56 different characters are included in the dataset. Uppercase letters that look like their lowercase counterparts have been discarded. The dataset contains each character in 15 different fonts, 7 different orientations, and 5 different font thicknesses.

Size of the dataset = 56*15*7*5 = 29400
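The count can be double-checked by enumerating every combination. The short sketch below only uses the character list and the counts stated above. The constant names are made up for illustration, and the fonts, orientations, and thicknesses are represented by plain indices because the article does not list their actual values.

```python
import string
from itertools import product

# Character set exactly as listed in the article.
CHARSET = (
    "0 1 2 3 4 5 6 7 8 9 "
    "a b c d e f g h i j k l m n o p q r s t u v w x y z "
    "A B D E F G H J K L M N Q R T Y "
    ". ( ) %"
).split()

# Counts from the article; the concrete fonts, angles, and
# thickness values are not given, so indices stand in for them.
NUM_FONTS = 15
NUM_ORIENTATIONS = 7
NUM_THICKNESSES = 5

variants = list(
    product(
        CHARSET,
        range(NUM_FONTS),
        range(NUM_ORIENTATIONS),
        range(NUM_THICKNESSES),
    )
)

# Uppercase letters that were left out of the character set.
dropped = sorted(set(string.ascii_uppercase) - set(CHARSET))

print(len(CHARSET))   # 56
print(len(variants))  # 29400
print(dropped)        # ['C', 'I', 'O', 'P', 'S', 'U', 'V', 'W', 'X', 'Z']
```

Each tuple in `variants` corresponds to one image that a data generator would render, so the list length matches the dataset size.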

Training:

The architecture of the model used for classification is given in the diagram below: