Text Detection in Images with EasyOCR in Python
Optical character recognition (OCR) is an important technology that allows computers to identify text in images and convert it into machine-readable text. This enables the extraction of text from scanned documents, photos, screenshots, and more for further natural language processing.
In this article, we will use the easyocr Python library to detect and recognize text in images. easyocr provides a simple, ready-to-use OCR API that does not require training a model. It is built on top of PyTorch and can detect text in over 80 languages out of the box.
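As a quick preview, here is a minimal sketch of the workflow this article builds up step by step (it assumes easyocr has already been installed, for example with pip install easyocr):

import easyocr

# Create a reader for one or more of the 80+ supported languages
reader = easyocr.Reader(['en'])

# readtext() accepts a file path (or a NumPy image array) and returns
# a list of (bounding box, text, confidence) tuples
results = reader.readtext('image/preview.jpg')
for _, text, confidence in results:
    print(text, confidence)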
Importing Libraries
We first import the necessary libraries:
import cv2
import easyocr
import matplotlib.pyplot as plt
- cv2: OpenCV library for image processing and computer vision
- easyocr: Library for optical character recognition
- matplotlib.pyplot: Library for visualization and plotting
Defining Util Functions
We define a helper function to draw bounding boxes around detected text and overlay the recognized text on the image:
def draw_bounding_boxes(image, detections, threshold=0.25):
    for bbox, text, score in detections:
        if score > threshold:
            # bbox[0] is the top-left corner, bbox[2] the bottom-right corner
            cv2.rectangle(image, tuple(map(int, bbox[0])), tuple(map(int, bbox[2])), (0, 255, 0), 5)
            cv2.putText(image, text, tuple(map(int, bbox[0])), cv2.FONT_HERSHEY_COMPLEX_SMALL, 0.65, (255, 0, 0), 2)
The draw_bounding_boxes function takes the input image, the detected text from EasyOCR, and an optional threshold score as arguments.
It loops through each detected text region returned by EasyOCR, which contains the bounding box coordinates, the recognized text, and confidence score for that detection.
For each detection, it checks if the confidence score exceeds the threshold we defined (default 0.25). This filters out weak or dubious detections.
For the detections that pass the threshold, it does the following:
- Draws a green bounding box on the input image to highlight the text region. cv2.rectangle() is used to draw the rectangle using the bounding box coordinates.
- Annotates the bounding box with the detected text using cv2.putText(). The color (255, 0, 0) is blue in OpenCV's BGR order, so the recognized text is overlaid in blue on top of the image.
- The floating point corner coordinates bbox[0] and bbox[2] are converted to integer tuples with map(int, ...), because OpenCV's drawing functions require integer pixel coordinates.
- The text is placed at the top-left coordinate of the bbox using bbox[0].
- Font style, size, and thickness are also specified for the text overlay.
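To make the indexing above concrete, each detection returned by EasyOCR is a tuple of (bounding box, text, confidence), where the bounding box is a list of four [x, y] corner points in the order top-left, top-right, bottom-right, bottom-left. A hypothetical detection might look like this (the values are made up purely for illustration):

# (bbox, text, score) as returned by reader.readtext()
detection = (
    [[10.0, 20.0], [210.0, 20.0], [210.0, 60.0], [10.0, 60.0]],  # corners: TL, TR, BR, BL
    "Hello",
    0.93,
)
bbox, text, score = detection
top_left = tuple(map(int, bbox[0]))      # (10, 20), used for the rectangle and the text anchor
bottom_right = tuple(map(int, bbox[2]))  # (210, 60), used as the opposite rectangle corner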
Loading the Image
We load the input image to run OCR on:
image_path = "image/preview.jpg"
img = cv2.imread(image_path)
if img is None:
raise ValueError("Error loading the image. Please check the file path.")
- First we define the path to the input image file, "image/preview.jpg".
- We then use OpenCV’s imread() function to load the image from that path. imread() loads the image in BGR format by default.
- The img variable will hold the loaded image array if successful.
- We add a check to see if img is None, which means the image failed to load.
- In case of a failure, we raise a ValueError with a custom error message asking to check the file path.
- This ensures that the code will exit gracefully if the image cannot be loaded, instead of causing crashes later.
- We could add more handling here, such as logging the error or printing the exception stack trace, as sketched below.
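For example, a slightly more defensive version of the loading step might look like the following sketch, which builds on the imports above and uses Python's built-in logging module (the logger setup and message text are illustrative choices, not part of the original code):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

image_path = "image/preview.jpg"
img = cv2.imread(image_path)
if img is None:
    # Log the problem before raising so the failure is recorded
    logger.error("Failed to load image at %s", image_path)
    raise ValueError("Error loading the image. Please check the file path.")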
Running Text Detection
We instantiate the easyocr reader for English text and set gpu=False to run on the CPU. If you have a GPU available, set gpu=True for faster inference:
reader = easyocr.Reader(['en'], gpu=False)
Then we call the .readtext() method to run text detection and recognition on the image:
text_detections = reader.readtext(img)
This returns a list with one tuple per detected text region, each containing the bounding box coordinates, the recognized text, and the detection confidence score.
We set a score threshold of 0.25 to filter weak detections.
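Before drawing anything, it can be helpful to inspect the raw detections. The following sketch simply prints each detection that clears the threshold (the output formatting is just one possible choice):

# Print the confidence, text, and top-left corner of each strong detection
for bbox, text, score in text_detections:
    if score > 0.25:
        print(f"{score:.2f}  {text}  at {bbox[0]}")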
Visualizing the Output
Finally, we call our draw_bounding_boxes function to annotate the image with the text detections:
threshold = 0.25
draw_bounding_boxes(img, text_detections, threshold)
We display the image with matplotlib:
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGBA))
plt.show()
The green bounding boxes illustrate the detected text regions, with the recognized text overlaid in blue.
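If you also want to keep the annotated result on disk, a short sketch using OpenCV's imwrite would be (the output filename is an arbitrary choice):

# Save the annotated image; img is still in BGR order, which is what imwrite expects
cv2.imwrite("annotated_preview.jpg", img)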
And that’s it! With just a few lines of code, we can run performant text detection and OCR on images using easyocr in Python. The library handles the model architecture and optimizations, while providing a simple interface to integrate it into any image processing pipeline.