Text Detection and Extraction From Image with Python

A handy OCR and OpenCV technique to find text in digital images

Amit Chauhan
The Pythoneers


Photo by Alex Chumak on Unsplash

This article gives you a glimpse of how to extract text from digital images. We will use Python and the pytesseract library to extract the text. The input image must contain text for the extraction to return any output.

Extracting text with pytesseract requires a few libraries to be installed in your environment. The commands below install them.

To install the OpenCV library

pip install opencv-python

To install the pytesseract library

pip install pytesseract
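After running the two pip commands, it can help to confirm that both packages are importable before going further. The snippet below is a minimal sketch of such a check; the helper name `missing_packages` is made up for illustration. Note that the opencv-python package is imported under the name `cv2`.

```python
import importlib.util


def missing_packages(modules=("cv2", "pytesseract")):
    """Return the names of the modules that cannot be found in this environment.

    `missing_packages` is a hypothetical helper, not part of either library.
    opencv-python installs under the import name `cv2`.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("OpenCV and pytesseract are both importable.")
```

An empty list means both installs succeeded in the Python environment you are currently using.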

pytesseract is a wrapper around the Tesseract OCR engine, so the engine itself must also be installed. On Windows, download the Tesseract installer from the link below to get the tesseract.exe file.

https://github.com/UB-Mannheim/tesseract/wiki

Download the installer that matches your system configuration, then run it. After installation, the tesseract.exe file will be at a path like the one shown below:

C:\Program Files\Tesseract-OCR\tesseract.exe
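Once the executable is installed, pytesseract needs to know where it lives. The sketch below is one possible way to wire that up, assuming the default install path shown above. The helpers `find_tesseract` and `configure_pytesseract` are hypothetical names introduced for illustration; `pytesseract.pytesseract.tesseract_cmd` is the setting that pytesseract reads.

```python
import os
import shutil

# Default install location from the article; adjust it if you installed elsewhere.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(default_path=DEFAULT_WINDOWS_PATH):
    """Return the path to the Tesseract executable, or None if it is not found."""
    # Prefer an executable that is already on the system PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the default install location.
    if os.path.isfile(default_path):
        return default_path
    return None


def configure_pytesseract(default_path=DEFAULT_WINDOWS_PATH):
    """Point pytesseract at the Tesseract executable and return the path used."""
    import pytesseract  # imported here so find_tesseract works without it

    path = find_tesseract(default_path)
    if path is None:
        raise FileNotFoundError("Tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling `configure_pytesseract()` once at the start of a script should be enough. If Tesseract is already on the PATH, pytesseract usually finds it without this step.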

Let’s see the input image from which we need to extract the text.

In this Python example, we will extract text from a grayscale image, and in the next…
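The article's own code is not shown here, so the following is only a rough sketch of what a grayscale extraction could look like with OpenCV and pytesseract. The function name `extract_text_from_grayscale` and the file name in the usage comment are placeholders, not taken from the original.

```python
def extract_text_from_grayscale(image_path):
    """Read an image, convert it to grayscale, and return the text Tesseract finds.

    Hypothetical helper for illustration; it assumes opencv-python, pytesseract,
    and the Tesseract engine are all installed.
    """
    import cv2
    import pytesseract

    # cv2.imread returns None instead of raising when the file cannot be read.
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # OpenCV loads color images in BGR order, so convert from BGR to grayscale.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # image_to_string runs OCR on the image and returns the recognized text.
    return pytesseract.image_to_string(gray)


# Example usage with a placeholder file name:
# print(extract_text_from_grayscale("sample.png"))
```

Grayscale conversion drops color information that OCR rarely needs, which is a common first preprocessing step before trying thresholding or noise removal.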
