Optical Character Recognition

Meftun Akarsu
Published in MOVE ON AI
5 min read · Mar 2, 2022
Photo by Ion Fet on Unsplash

OCR (Optical Character Recognition) is a technology that converts documents such as scanned paper pages, PDF files, or photos taken with a digital camera into editable and searchable data. It works by locating and separating the individual characters in an image, then assembling those characters into words and the words into sentences.

In this article, we compare Keras OCR, PyTesseract and EasyOCR.

If you don’t have any prior knowledge of OCR, this article is a good place to start.

KERAS OCR:

This is a slightly polished and packaged version of the Keras CRNN implementation and the published CRAFT text detection model. It provides a high-level API for training a text detection and OCR pipeline.

The package ships with an easy-to-use implementation of the CRAFT text detection model from this repository and the CRNN recognition model from this repository.

Installation

pip install keras-ocr

Usage

First, import the libraries:

import matplotlib.pyplot as plt
import keras_ocr

keras-ocr will automatically download pretrained weights for the detector and recognizer.

pipeline = keras_ocr.pipeline.Pipeline()

Get a set of three example images:

images = [
    keras_ocr.tools.read(img)
    for img in ['path/img1.jpg', 'path/img2.jpg', 'path/img3.jpg']
]

Each list of predictions in prediction_groups is a list of (word, box) tuples:

prediction_groups = pipeline.recognize(images)

The last step is to plot the predictions:

fig, axs = plt.subplots(nrows=len(images), figsize=(20, 20))
for ax, image, predictions in zip(axs, images, prediction_groups):
    keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)
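Because each prediction is a (word, box) tuple, the recognized words can also be pulled out as plain text. The following is a minimal sketch and not part of keras-ocr: the helper name words_in_reading_order, the line_tolerance parameter, and the sample boxes are all assumptions made for illustration. It assumes each box is a sequence of four (x, y) corner points and uses a rough heuristic to order words top-to-bottom, then left-to-right.

```python
def words_in_reading_order(predictions, line_tolerance=20):
    """Sort (word, box) tuples top-to-bottom, then left-to-right.

    Boxes whose top edges fall into the same `line_tolerance`-pixel band
    are treated as one text line. This is a heuristic, not a layout model.
    """
    def sort_key(prediction):
        _, box = prediction
        top = min(float(y) for _, y in box)
        left = min(float(x) for x, _ in box)
        return (int(top // line_tolerance), left)

    return [word for word, _ in sorted(predictions, key=sort_key)]


# Hypothetical predictions for one image, in the (word, box) format.
sample = [
    ("world", [[120, 12], [200, 12], [200, 40], [120, 40]]),
    ("hello", [[10, 10], [100, 10], [100, 40], [10, 40]]),
    ("again", [[10, 60], [100, 60], [100, 90], [10, 90]]),
]
print(words_in_reading_order(sample))  # ['hello', 'world', 'again']
```

With real output you would call it once per image, for example on prediction_groups[0].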

EasyOCR

Ready-to-use OCR with 80+ supported languages (including Turkish) and all popular writing scripts, such as Latin and Chinese.

Installation

pip install easyocr

Usage

We need to tell EasyOCR which languages we want to read. A single model can read multiple languages at the same time, provided they are compatible: English is compatible with every language, and languages that share most of their characters (e.g. those written in Latin script) are compatible with each other.

So first we have to select the languages. The available language codes are in the list of supported languages.

import easyocr

reader = easyocr.Reader(['tr', 'en'])

To get the text from an image, pass the image path to the readtext function:

result = reader.readtext('path/image.jpg')

The output is a list in which each item holds the bounding box, the detected text, and the confidence level, respectively.

[([[482, 418], [633, 418], [633, 494], [482, 494]], 'Text1', 0.9577),
 ([[331, 421], [453, 421], [453, 487], [331, 487]], 'Text2', 0.9630),
 ([[653, 429], [769, 429], [769, 495], [653, 495]], 'Text3', 0.9243),
 ([[797, 429], [939, 429], [939, 497], [797, 497]], 'Text4', 0.6400)]
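Since every item carries a confidence level, low-confidence detections can be filtered out before the text is used. Here is a minimal sketch working on the sample output above. The helper name confident_text and the 0.9 cut-off are assumptions chosen for illustration, not EasyOCR defaults.

```python
def confident_text(results, min_confidence=0.9):
    """Return the text of detections whose confidence is at least min_confidence."""
    return [text for _box, text, confidence in results if confidence >= min_confidence]


# The sample (bounding box, text, confidence) output shown above.
results = [
    ([[482, 418], [633, 418], [633, 494], [482, 494]], 'Text1', 0.9577),
    ([[331, 421], [453, 421], [453, 487], [331, 487]], 'Text2', 0.9630),
    ([[653, 429], [769, 429], [769, 495], [653, 495]], 'Text3', 0.9243),
    ([[797, 429], [939, 429], [939, 497], [797, 497]], 'Text4', 0.6400),
]
print(confident_text(results))  # ['Text1', 'Text2', 'Text3']
```

A suitable threshold depends on your images, so it is worth checking what gets dropped before settling on a value.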

PyTesseract

Python-tesseract is an optical character recognition (OCR) tool for Python. That is, it will recognize and “read” the text embedded in images.

Installation

pip install pytesseract

Note that pytesseract is only a wrapper: the Tesseract OCR engine itself must be installed separately and be available on your PATH.

Usage

Import libraries:

from PIL import Image
import pytesseract

Simple image to string:

print(pytesseract.image_to_string(Image.open('path/image1.jpg')))
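The string returned by image_to_string is raw engine output, so it usually contains blank lines and stray whitespace (and, with some Tesseract versions, a trailing form-feed character). A minimal sketch of tidying it up follows. The helper name clean_ocr_text is hypothetical and the raw string is an invented sample, not real output.

```python
def clean_ocr_text(raw):
    """Strip every line of raw OCR output and drop the empty ones."""
    lines = (line.strip() for line in raw.splitlines())
    return [line for line in lines if line]


# Invented sample standing in for the string returned by image_to_string.
raw = "ABCDEFGHIJKLM\n\nOPQRSTUVWKXYZ  \n\x0c"
print(clean_ocr_text(raw))  # ['ABCDEFGHIJKLM', 'OPQRSTUVWKXYZ']
```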

COMPARISON OF ALL

First, import the libraries and load the images.

import pytesseract
import keras_ocr
import easyocr
import matplotlib.pyplot as plt
import cv2
import numpy as np

Visualize your images:

Create OCR class:

We call our load_img function in the constructor and load the models with the …_model_load() functions.

I also added some visualization lines.
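To give a rough idea of what such a class could look like, here is a hypothetical sketch. It is not the original implementation: only the constructor argument image_folder, the name load_img, and the three …_works method names come from this article. Everything else is an assumption, including the choice to store file paths rather than pixel data, the accepted file suffixes, and loading each model inside its own method instead of in separate model-load functions. Visualization is left out.

```python
from pathlib import Path


class OCR:
    """Hypothetical wrapper that runs three OCR engines over one image folder."""

    IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image types

    def __init__(self, image_folder):
        self.image_paths = self.load_img(image_folder)

    def load_img(self, image_folder):
        """Collect the image file paths in the folder (assumed behavior)."""
        return sorted(
            str(path)
            for path in Path(image_folder).iterdir()
            if path.suffix.lower() in self.IMAGE_SUFFIXES
        )

    def keras_ocr_works(self):
        import keras_ocr  # imported lazily so the class loads without it

        pipeline = keras_ocr.pipeline.Pipeline()
        images = [keras_ocr.tools.read(path) for path in self.image_paths]
        return pipeline.recognize(images)

    def easyocr_model_works(self):
        import easyocr

        reader = easyocr.Reader(["tr", "en"])
        return [reader.readtext(path) for path in self.image_paths]

    def pytesseract_model_works(self):
        import pytesseract
        from PIL import Image

        return [
            pytesseract.image_to_string(Image.open(path))
            for path in self.image_paths
        ]
```

Importing each engine inside its method keeps the sketch usable when only one of the three libraries is installed.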

Let’s initialize the ocr object.

ocr = OCR(image_folder="test/")

To test Keras OCR

Call the keras_ocr_works() method:

ocr.keras_ocr_works()

KERAS OCR image 1
KERAS OCR image 2
KERAS OCR image 3

To test Easy OCR

Call the easyocr_model_works() method:

ocr.easyocr_model_works()

TR
414
DK 205
(DUR
THE
RAINBCW
FONT
ABCDEFGHIJKLMN
OPQRSTUVWXYZ
illisgois
LanoLLincoln
American Super Cars
APR
568 612
M
'89IL
MAR
82tui
MoToRShOW
TEXAS
@
Arizona
RY
RK
611 BUc |KB5]9
Grano camyom staten
XAS
MY TRUCK
HZK
THE LONE 5{
BARTIN
74
Nufus:
66100
Rakım:
2 5
Sınırı
Bölgesi
50
Hız
Alergico

To test Pytesseract

Call the pytesseract_model_works() method:

ocr.pytesseract_model_works()

3 images have no results.

ABCDEFGHIJKLM
OPQRSTUVWKXYZ
BARTINNuifus: 66100]!
Rakim: Pipe)
Alergico

CONCLUSIONS

  • Pytesseract does not seem to be very good at detecting text across an entire image and converting it to a string. Instead, the text regions should first be located with a text detector and then passed to the OCR engine.
  • keras_ocr is good in terms of accuracy but costly in terms of time, especially if you are running on a CPU.
  • Keras-OCR is an image-specific OCR tool. It is a good fit when the text is embedded in the image and its fonts and colors are irregular.
  • EasyOCR is a lightweight model that performs well on receipts and PDF conversion. It gives more accurate results on organized text such as PDF files, receipts, and bills, and it also performs well on noisy images.
  • Pytesseract performs well on high-resolution images. Morphological operations such as dilation and erosion, as well as Otsu binarization, can help improve its performance.
  • All these results can be further improved by performing specific image operations.
  • OCR prediction depends not only on the model but also on many other factors, such as the clarity of the image, its grayscale conversion, the hyperparameters, and the weights used.
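
To make the Otsu binarization mentioned above more concrete, here is a minimal pure-Python sketch of the idea: pick the threshold that best separates dark and bright pixels, then map every pixel to black or white. The function names and the tiny pixel list are made up for illustration. In practice you would let OpenCV do this with cv2.threshold and the cv2.THRESH_OTSU flag rather than write it yourself.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_bg = 0  # number of pixels at or below the candidate threshold
    sum_bg = 0     # sum of their intensities
    for level in range(256):
        weight_bg += hist[level]
        sum_bg += level * hist[level]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(pixels, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [255 if value > threshold else 0 for value in pixels]


# Made-up grayscale values: dark "ink" pixels and a bright "paper" background.
pixels = [10] * 50 + [200] * 50
threshold = otsu_threshold(pixels)
binary = binarize(pixels, threshold)
```

The binarized image, not the original, is what you would then hand to pytesseract.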

Thank you.
