4 Simple steps in building OCR

Optical character recognition (OCR) is the process of classifying optical patterns contained in a digital image. Character recognition is achieved through segmentation, feature extraction, and classification.

OCR is the recognition of printed or written text characters by a computer. It involves scanning the text character by character, analyzing the scanned-in image, and then translating each character image into a character code, such as ASCII, commonly used in data processing.

In OCR processing, the scanned-in image or bitmap is analyzed for light and dark areas in order to identify each alphabetic letter or numeric digit. When a character is recognized, it is converted into an ASCII code. Special circuit boards and computer chips designed expressly for OCR are used to speed up the recognition process.

Steps in Optical Character Recognition:
1) Extracting character boundaries from the image,
2) Building a Convolutional Neural Network (ConvNet) that learns the character images,
3) Loading the trained ConvNet model,
4) Consolidating the ConvNet predictions of characters

The algorithm is built to segment each character in an image into its own image :-), and then to recognize the characters and consolidate them into the text of the image.

1) Optical Scanning from Image:

  • Select any document or letter that contains text information.
  • Extract character boundaries: A contour is simply a curve joining all the continuous points along a boundary. Contours are a useful tool for shape analysis and for object detection and recognition. Here, contours are used to separate each individual character in an image, with the help of dilation. A bounding box is created around each character using the OpenCV contours method.
  • OpenCV code implementation for separating the words using contours:
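import cv2
import numpy as np

# Load the image (placeholder path) and make the grayscale copy used for thresholding
im = cv2.imread('data/input.jpg')
im1 = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)

# Inverse threshold: dark text becomes white foreground on a black background
ret, thresh1 = cv2.threshold(im1, 180, 255, cv2.THRESH_BINARY_INV)
# Dilate so that neighbouring strokes merge into connected blobs
kernel = np.ones((5, 5), np.uint8)
dilated = cv2.dilate(thresh1, kernel, iterations=2)
# findContours returns 3 values in OpenCV 3.x and 2 in 4.x; [-2] is the contour list in both
contours = cv2.findContours(dilated, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]
cordinates = []
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    cordinates.append((x, y, w, h))
    # Draw the bounding box on the original image
    cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 1)

# Save the image with the bounding boxes drawn on it
cv2.imwrite('data/BindingBox4.jpg', im)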

Naming convention followed (labelling): Each extracted character image should be labelled with the actual character it shows.

The naming convention followed here is that the last letter of the file name is the character shown in the image, so that the labels can be recovered when pre-processing the image data.
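The naming convention can be sketched in a few lines of Python. This is only an illustration: the helper `label_from_filename` and the file names are hypothetical, and it assumes that "last letter of the file name" means the last character before the file extension.

```python
from pathlib import Path


def label_from_filename(path):
    """Return the label stored as the last character of the file name.

    Assumption: the extension (e.g. ".png") is not part of the name.
    """
    stem = Path(path).stem  # file name without directory or extension
    if not stem:
        raise ValueError(f"cannot derive a label from {path!r}")
    return stem[-1]


# Hypothetical file names, used only to illustrate the convention
files = [
    "data/chars/img_001_A.png",
    "data/chars/img_002_b.png",
    "data/chars/img_003_7.png",
]
labels = [label_from_filename(f) for f in files]
print(labels)  # ['A', 'b', '7']
```

Keeping the label in the file name means no separate annotation file is needed, at the cost of not being able to label characters that are not valid in file names.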

  • Pre-processing
  1. Depending on how the data was acquired, the raw data is subjected to a number of preliminary processing steps to make it usable in the descriptive stages of character analysis. The image resulting from the scanning process may contain a certain amount of noise.
  2. Smoothing implies both filling and thinning. Filling eliminates small breaks, gaps, and holes in digitized characters, while thinning reduces the width of lines.
  Pre-processing aims at:
  (a) noise reduction,
  (b) normalization of the data, and
  (c) compression in the amount of information to be retained.
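As a rough illustration of goals (a) and (b), the sketch below applies a 3x3 median filter and a min-max normalization to a tiny grayscale patch. It is written in plain Python to stay dependency-free; the function names and the sample patch are made up for this example, and in practice one would more likely use a library routine such as OpenCV's `cv2.medianBlur`.

```python
from statistics import median


def median_filter(img):
    """Noise reduction: replace each pixel by the median of its 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # Collect the neighbourhood, clipped at the image borders
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median(window)
    return out


def normalize(img):
    """Normalization: rescale pixel values to the range [0.0, 1.0]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1  # avoid division by zero on a flat image
    return [[(p - lo) / span for p in row] for row in img]


# Made-up grayscale patch with one isolated noisy pixel in the middle
patch = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
denoised = median_filter(patch)
scaled = normalize(patch)
print(denoised[1][1])  # the isolated 255 is replaced by 10
print(scaled[1][1])    # 1.0, the brightest pixel after rescaling
```

The median filter removes the isolated bright pixel because it is outvoted by its neighbours, which is why it suits the speckle-like noise that scanning tends to introduce.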

2) Build a ConvNet Model (Character Recognition Model):

A convolutional network of 8 layers, with 2*4 layers of residual feedback, is used to learn the patterns of the individual character images.

  • The 1st model is trained on the individual character images as a direct classifier: it predicts the character category of each image with a softmax classification.
  • The 2nd model is the same model with the second-to-last layer used as the predictor, which calculates an embedding from the specified flattened neurons (the predicted flattened values carry the feature information of the receipt images).
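The softmax classification used by the 1st model can be illustrated independently of any deep-learning framework. The sketch below is not the network itself: it only shows how the raw scores of a final layer are turned into probabilities and then into a predicted character. The class list and the scores are made up for the example.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_character(scores, classes):
    """Return the most probable class and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return classes[best], probs[best]


# Hypothetical character categories and made-up scores for one character image
classes = ["A", "B", "C"]
scores = [2.0, 0.5, -1.0]
char, prob = predict_character(scores, classes)
print(char, round(prob, 3))
```

The 2nd model differs only in where the output is read: instead of these class probabilities, it returns the activations of the layer just before the classifier as an embedding.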

3) Load Trained ConvNet OCR model:

The last step of optical character recognition involves preprocessing the image into word-level contours and letter-level contours, followed by prediction and consolidation of the results according to the letter and word contours in the image.

Once the model has been trained, we can save it and later load the pre-trained optical character recognition model.

4) Test and Consolidate Predictions of OCR:

Consolidating predictions involves assigning a specific ID to each word contour, together with the line of the image the word belongs to, and then consolidating all predictions into a sorted series of word contours and the letters associated with each word.
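One possible way to implement this consolidation is sketched below. It is an assumption-laden illustration rather than the exact procedure used here: boxes are `(x, y, w, h)` tuples as returned by `cv2.boundingRect`, a letter belongs to the word box that contains its center, and words whose top edges lie within a hypothetical `line_tol` pixels of each other share a line.

```python
def center(box):
    """Center point of an (x, y, w, h) box."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def contains(box, point):
    """True if the point lies inside the (x, y, w, h) box."""
    x, y, w, h = box
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


def consolidate(word_boxes, letters, line_tol=10):
    """Rebuild text from word boxes and (letter_box, predicted_char) pairs."""
    # Group word boxes into lines, top to bottom
    lines = []
    for box in sorted(word_boxes, key=lambda b: (b[1], b[0])):
        if lines and abs(box[1] - lines[-1][0][1]) <= line_tol:
            lines[-1].append(box)  # same line as the previous word
        else:
            lines.append([box])    # start a new line
    text_lines = []
    for line in lines:
        words = []
        for wbox in sorted(line, key=lambda b: b[0]):  # left to right
            inside = [(lb, ch) for lb, ch in letters if contains(wbox, center(lb))]
            inside.sort(key=lambda item: item[0][0])   # letters left to right
            words.append("".join(ch for _, ch in inside))
        text_lines.append(" ".join(words))
    return "\n".join(text_lines)


# Made-up word boxes and letter predictions, deliberately out of order
word_boxes = [(60, 10, 30, 20), (10, 12, 40, 20), (10, 50, 40, 20)]
letters = [
    ((24, 14, 8, 16), "I"), ((12, 14, 8, 16), "H"),
    ((74, 12, 8, 16), "K"), ((62, 12, 8, 16), "O"),
    ((12, 52, 8, 16), "G"), ((24, 52, 8, 16), "O"),
]
print(consolidate(word_boxes, letters))
# HI OK
# GO
```

A fixed pixel tolerance is the simplest way to group words into lines; for skewed or multi-column scans it would likely need to be replaced by something more robust.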