How to extract text from any image with Deep Learning

Alessandro Lamberti
Published in Artificialis
3 min read · Sep 18, 2021


Photo by Patrick Tomasso on Unsplash

Update: check out Hydra AI: Accurately extract text from any image

Extracting text from images is a task called Optical Character Recognition (OCR).
More technically:

It is the conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast).

How is OCR different from a scanner? A scanner merely saves the page as an image file, so you cannot select, copy, or edit the text in it. OCR converts that image into machine-encoded text you can edit.

Guess what? You can perform OCR with Python in just a few lines of code!
We’re going to use the EasyOCR package.
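To give a feel for how few lines that really is, here is a minimal sketch. It assumes EasyOCR is installed (pip install easyocr) and relies on readtext returning a list of (bounding box, text, confidence) entries. The helper names keep_confident and extract_text, and the 0.5 threshold, are my own choices for illustration, not part of the library.

```python
def keep_confident(results, min_confidence=0.5):
    """Keep only the text of detections at or above a confidence threshold.

    `results` is expected in the shape EasyOCR's readtext returns:
    a list of (bounding_box, text, confidence) entries.
    The 0.5 default is an arbitrary illustrative value.
    """
    return [text for _bbox, text, conf in results if conf >= min_confidence]


def extract_text(image_path, languages=("en",), min_confidence=0.5):
    """Run OCR on one image and return the confident text snippets."""
    # Imported lazily so keep_confident can be used without EasyOCR installed.
    import easyocr

    # The first call downloads the detection and recognition models.
    reader = easyocr.Reader(list(languages))
    results = reader.readtext(image_path)
    return keep_confident(results, min_confidence)
```

Calling extract_text("sign.jpg") (a hypothetical file name) would return a list of strings, one per detected text region that passed the threshold.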

EasyOCR overview

EasyOCR is implemented in Python on top of PyTorch. If you have a CUDA-capable GPU, PyTorch can use it to speed up text detection and recognition considerably!
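One way to take advantage of this without hard-coding anything is to ask PyTorch whether CUDA is available and pass the answer to the Reader's gpu argument. This is only a sketch: cuda_available and make_reader are hypothetical helper names, and it assumes both torch and easyocr are installed when make_reader is called.

```python
def cuda_available():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch  # lazy import: fall back to CPU if PyTorch is missing
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def make_reader(languages=("en",)):
    """Create an EasyOCR reader that uses the GPU only when one is available."""
    import easyocr

    # gpu=False makes EasyOCR run on the CPU, which is slower but works anywhere.
    return easyocr.Reader(list(languages), gpu=cuda_available())
```

On a machine without a CUDA device the same code still works; it simply runs on the CPU.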

As of now, the library supports 80+ languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari…
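Languages are selected by passing a list of language codes when the reader is created. The sketch below assumes the codes en, ch_sim (Simplified Chinese), ar (Arabic) and hi (Hindi, written in Devanagari) from EasyOCR's supported-language list, each paired with English. The SCRIPT_EXAMPLES table and both helper functions are illustrative names of my own, not part of the library.

```python
# Illustrative mapping from a writing script to EasyOCR language codes.
# The pairings are assumptions chosen for this example, not an exhaustive list.
SCRIPT_EXAMPLES = {
    "latin": ["en"],
    "chinese": ["ch_sim", "en"],
    "arabic": ["ar", "en"],
    "devanagari": ["hi", "en"],
}


def languages_for(script):
    """Look up the example language codes for a script name (case-insensitive)."""
    key = script.lower()
    if key not in SCRIPT_EXAMPLES:
        raise ValueError(f"no example languages for script: {script!r}")
    return list(SCRIPT_EXAMPLES[key])


def reader_for_script(script):
    """Create an EasyOCR reader for one of the example scripts."""
    import easyocr  # lazy import so the lookup above works on its own

    return easyocr.Reader(languages_for(script))
```

Not every combination of languages can share one reader, so check the library's documentation before mixing scripts beyond pairing them with English.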
