A Box detection algorithm for any image containing boxes.

Kanan Vyas
Coinmonks
6 min read · Jul 22, 2018


When you work on optical character recognition (OCR) or any other data or object recognition problem, the first step is preprocessing. Here, preprocessing means finding the regions of the image where our information is located. Once those regions are extracted, a machine learning algorithm can be run on them.

The problem gets harder when the objects you want to detect sit inside tables, boxes, or a row-column layout. In that case you have to detect the boxes and extract them one by one, and it has to work accurately for all your images. As an example, look at the following image:

Example of an image for extracting information.

For this image, I want to run optical character recognition on all the equations. I want to extract each non-blank cell one by one so that I can detect the numbers. After extracting each cell, I will segment the numbers and apply my ML model to recognize them. For this algorithm we will use Python with OpenCV and NumPy. So let's start extracting each cell one by one:

First import some libraries:

import cv2
import numpy as np

Now read the image as grayscale, threshold it, and invert it.

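# Read the image as grayscale (flag 0)
img = cv2.imread(img_for_box_extraction_path, 0)

# Threshold the image. With THRESH_OTSU set, the threshold is chosen automatically and the 128 is ignored.
(thresh, img_bin) = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# Invert the image so that lines and text become white on black
img_bin = 255 - img_bin
cv2.imwrite("Image_bin.jpg", img_bin)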

So our image will look like this:

Image_bin.jpg

Now we need to detect the boxes, and for that we will use morphological operations. We define rectangular kernels whose length is based on the width of the image: 1) a kernel to detect horizontal lines, and 2) a kernel to detect vertical lines.

# Defining a kernel length
kernel_length = img.shape[1] // 80

# A vertical kernel, 1 pixel wide and kernel_length pixels tall, which will detect all the vertical lines in the image.
verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))
# A horizontal kernel, kernel_length pixels wide and 1 pixel tall, which will detect all the horizontal lines in the image.
hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))
# A kernel of (3 X 3) ones.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

Now that the kernels are defined, we apply morphological operations to detect the vertical and horizontal lines. The code below produces one image containing the vertical lines and one containing the horizontal lines.

# Morphological operation to detect vertical lines from an image
img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
cv2.imwrite("verticle_lines.jpg", verticle_lines_img)
# Morphological operation to detect horizontal lines from an image
img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
cv2.imwrite("horizontal_lines.jpg", horizontal_lines_img)

Image containing vertical lines
Image containing horizontal lines

Now we add these two images together. The result contains only the box lines, and the information written inside the boxes is erased. That lets us detect the boxes accurately, without the text adding noise that could cause false box extraction.

# Weighting parameters: these decide how much each image contributes to the new image.
alpha = 0.5
beta = 1.0 - alpha
# addWeighted combines two images with the given weights into a third image, their weighted sum.
img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
(thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imwrite("img_final_bin.jpg", img_final_bin)
Final image containing only boxes

Now we apply the findContours() method to this image. It finds all the boxes, and we then sort them from top to bottom. To sort the contours we use the sort_contours() function from https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/ with its top-to-bottom method.

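# Find contours for image, which will detect all the boxes.
# OpenCV 3 returns (image, contours, hierarchy), while OpenCV 2 and 4 return (contours, hierarchy);
# taking the last two values works with all of them.
contours, hierarchy = cv2.findContours(img_final_bin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# Sort all the contours from top to bottom.
(contours, boundingBoxes) = sort_contours(contours, method="top-to-bottom")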

Now loop over all the contours, find the location of each box, crop the part of the image inside it, and save it to a folder.

idx = 0
for c in contours:
    # Returns the location and the width and height of every contour
    x, y, w, h = cv2.boundingRect(c)
    # Save the box in the "cropped/" folder only if its width is > 80, its height is > 20,
    # and it is more than 3 times as wide as it is tall.
    if (w > 80 and h > 20) and w > 3*h:
        idx += 1
        new_img = img[y:y+h, x:x+w]
        cv2.imwrite(cropped_dir_path + str(idx) + '.png', new_img)

Now it's done! Check your folder and you will see one image for each extracted box, like this:

Extracted images
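
The size filter in the loop is worth a closer look, because it decides which contours count as cells. Here is a small sketch that pulls it into a function. The names `is_cell` and `cell_path` are hypothetical, and the default thresholds are simply the values from the loop above, which were picked for this particular image.

```python
import os


def is_cell(w, h, min_w=80, min_h=20, min_aspect=3):
    """Return True if a bounding box of size (w, h) passes the article's filter.

    Hypothetical helper: a box must be wider than min_w, taller than min_h,
    and more than min_aspect times as wide as it is tall.
    """
    return w > min_w and h > min_h and w > min_aspect * h


def cell_path(cropped_dir_path, idx):
    """Hypothetical helper: build the output path without needing a trailing slash."""
    return os.path.join(cropped_dir_path, f"{idx}.png")


print(is_cell(200, 40))   # True: wide, short rectangle
print(is_cell(200, 100))  # False: only twice as wide as it is tall
print(is_cell(60, 25))    # False: too narrow
```

If your cells are closer to square, the aspect-ratio condition will reject them, so these values would need adjusting.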

You can now use these images for further processing. For very large images, you can increase the kernel_length parameter to get good output.

Note: This method applies broadly, from detecting data on OMR sheets to Excel-style sheets. It uses standard morphological operations and erases all the information inside the boxes, so that content cannot cause false box detections. You can use it as a preprocessing step and get good output. :)

The whole code for box detection is here:

import cv2
import numpy as np

# sort_contours() is not part of OpenCV. It is the helper from the PyImageSearch post
# linked above and must be defined or imported before box_extraction() is called.


def box_extraction(img_for_box_extraction_path, cropped_dir_path):
    img = cv2.imread(img_for_box_extraction_path, 0)  # Read the image as grayscale
    (thresh, img_bin) = cv2.threshold(img, 128, 255,
                                      cv2.THRESH_BINARY | cv2.THRESH_OTSU)  # Threshold the image
    img_bin = 255 - img_bin  # Invert the image
    cv2.imwrite("Image_bin.jpg", img_bin)

    # Defining a kernel length
    kernel_length = img.shape[1] // 40

    # A vertical kernel, 1 pixel wide and kernel_length pixels tall, which will detect all the vertical lines in the image.
    verticle_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_length))
    # A horizontal kernel, kernel_length pixels wide and 1 pixel tall, which will detect all the horizontal lines in the image.
    hori_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_length, 1))
    # A kernel of (3 X 3) ones.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

    # Morphological operation to detect vertical lines from an image
    img_temp1 = cv2.erode(img_bin, verticle_kernel, iterations=3)
    verticle_lines_img = cv2.dilate(img_temp1, verticle_kernel, iterations=3)
    cv2.imwrite("verticle_lines.jpg", verticle_lines_img)

    # Morphological operation to detect horizontal lines from an image
    img_temp2 = cv2.erode(img_bin, hori_kernel, iterations=3)
    horizontal_lines_img = cv2.dilate(img_temp2, hori_kernel, iterations=3)
    cv2.imwrite("horizontal_lines.jpg", horizontal_lines_img)

    # Weighting parameters: these decide how much each image contributes to the new image.
    alpha = 0.5
    beta = 1.0 - alpha
    # addWeighted combines two images with the given weights into a third image, their weighted sum.
    img_final_bin = cv2.addWeighted(verticle_lines_img, alpha, horizontal_lines_img, beta, 0.0)
    img_final_bin = cv2.erode(~img_final_bin, kernel, iterations=2)
    (thresh, img_final_bin) = cv2.threshold(img_final_bin, 128, 255,
                                            cv2.THRESH_BINARY | cv2.THRESH_OTSU)

    # For debugging: writes the image of vertical and horizontal lines that is used to find the boxes
    cv2.imwrite("img_final_bin.jpg", img_final_bin)

    # Find contours for image, which will detect all the boxes.
    # Taking the last two return values works with OpenCV 2, 3 and 4.
    contours, hierarchy = cv2.findContours(
        img_final_bin, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
    # Sort all the contours from top to bottom.
    (contours, boundingBoxes) = sort_contours(contours, method="top-to-bottom")

    idx = 0
    for c in contours:
        # Returns the location and the width and height of every contour
        x, y, w, h = cv2.boundingRect(c)
        # Save the box in the "cropped/" folder only if its width is > 80, its height is > 20,
        # and it is more than 3 times as wide as it is tall.
        if (w > 80 and h > 20) and w > 3*h:
            idx += 1
            new_img = img[y:y+h, x:x+w]
            cv2.imwrite(cropped_dir_path + str(idx) + '.png', new_img)


box_extraction("41.jpg", "./Cropped/")

You can see the full source code here:

Thank You!


Kanan Vyas
Coinmonks

Software Engineer in ML | Founder at @clique_org | Interested in Cricket and Kathak.