A straightforward introduction to image thresholding using Python

Sagar Kumar · Published in Spinor · Oct 2, 2019
Original Image vs Thresholded Image (Source: Wikipedia)

What is Thresholding?

Thresholding is the process of dividing an image into two (or more) classes of pixels, e.g. “foreground” and “background”. It is used in many image processing tasks, such as segmentation and removing noise before OCR, which improves text recognition accuracy.

To obtain a thresholded image, we usually convert the original image to grayscale and then apply the thresholding technique. This method is also known as binarization, since we convert the image into a binarized form: if the value of a pixel is less than the threshold value, set it to 0 (black); if it is greater than the threshold value, set it to 1 (white), or vice versa.

# A simple thresholding rule (pseudo code) applied to a single pixel
if pixel_value > thresh:
    pixel_value = maxValue   # e.g. 255 (white)
else:
    pixel_value = 0          # black
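
The same rule can be applied to a whole image at once with NumPy. The snippet below is a minimal sketch, not from the original post; the tiny img array is just a stand-in for a real grayscale image.

import numpy as np

thresh, maxValue = 100, 255

# 'img' stands for any 2-D uint8 grayscale array,
# e.g. img = cv2.imread('OCR0.png', 0)
img = np.array([[30, 120], [200, 90]], dtype=np.uint8)

# Pixels above the threshold become white, the rest become black
binary = np.where(img > thresh, maxValue, 0).astype(np.uint8)
print(binary)  # [[  0 255]
               #  [255   0]]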

Let’s understand the importance of binarization with an example on OCR:

NOTE: For all my examples I am using pytesseract (a wrapper for Google’s Tesseract), Matplotlib, and OpenCV.

Please install all of the above-mentioned libraries. Since I was working in a Colab notebook, I only needed to install pytesseract.

# To install pytesseract
!sudo apt install tesseract-ocr
!pip install pytesseract

And import them.

from pytesseract import image_to_string
import cv2
import matplotlib.pyplot as plt
from google.colab.patches import cv2_imshow
%matplotlib inline

Okay, now suppose you have an image like the one below and you have to detect the text in it.

First, read the image file (make sure the format is supported by OpenCV), then pass the image to the image_to_string function as follows:

img = cv2.imread('OCR0.png', 0)  # 0 = read the image as grayscale
text = image_to_string(img)
print(text)

Output:-

Kindness is the
language which the
deaf can hear and
blind can see.

As you can see, the code has detected the right text. But will it perform as well on a noisy image like the one below?

img = cv2.imread('OCR1.png', 0)
text = image_to_string(img)
print(text)

Output:-

ness is the
L ge which the
deaf.can,hear a
blind can se

Clearly, the output is not what you might have expected. Let’s try the binary thresholding technique and then check the result.
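
Note that the next snippet calls a small plot_img helper that isn’t defined in this post. A minimal sketch of such a helper, relying on the matplotlib.pyplot import above, might look like this:

def plot_img(images, titles):
    """Display the given grayscale images side by side with their titles."""
    fig, axes = plt.subplots(1, len(images), figsize=(12, 4))
    for ax, image, title in zip(axes, images, titles):
        ax.imshow(image, cmap='gray')
        ax.set_title(title)
        ax.axis('off')
    plt.show()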

# Binary thresholding: pixels with values above 100 become 255 (white),
# the rest become 0 (black)
ret, img_binary = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY)
# Plot the images
images = [img, img_binary]
titles = ['Original image', 'THRESH_BINARY']
plot_img(images, titles)
# Now test the result again after thresholding
text = image_to_string(img_binary)
print(text)

Output:-

Kindness is the
language which the
deaf can hear and
blind can see.

Great!!!!

From the above result, you can probably see why thresholding techniques are important in digital image processing tasks.

Simple Thresholding

If the pixel value is greater than the threshold value, it is assigned one value (say, white); otherwise, it is assigned another value (say, black). The function used is

cv2.threshold(img, thresh_value, maxVal, style)
  • The first argument is the source image (a grayscale image).
  • The second argument is the threshold value used to classify the pixel values.
  • The third argument is maxVal, the value assigned if the pixel value is more than (or, for some styles, less than) the threshold value.
  • The fourth argument is the style of thresholding. OpenCV provides different styles of thresholding; check the documentation page, or see the sketch below.
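
As a quick illustration, here is a minimal sketch (not from the original post) of a few of those styles applied to the grayscale img loaded earlier; note that cv2.threshold also returns the threshold value it used.

# A few of the thresholding styles OpenCV provides (sketch)
ret, binary     = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY)      # >100 -> 255, else 0
ret, binary_inv = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY_INV)  # >100 -> 0, else 255
ret, truncated  = cv2.threshold(img, 100, 255, cv2.THRESH_TRUNC)       # >100 -> 100, else unchanged
ret, to_zero    = cv2.threshold(img, 100, 255, cv2.THRESH_TOZERO)      # >100 -> unchanged, else 0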

Note:- Although we get the correct text for this image using a simple threshold, that may not be the case for other images. There, we need to adjust the threshold value accordingly or perform some additional operations to get the correct result.

Apart from simple thresholding, OpenCV provides more thresholding functions, such as adaptive thresholding and Otsu’s Binarization.
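
Otsu’s Binarization is not used in this post, but as a rough sketch: it computes the threshold automatically from the image histogram, so the threshold value you pass in is ignored and the chosen value is returned.

# Otsu's Binarization (sketch): the threshold is computed automatically,
# so the 0 passed here is ignored; 'ret' holds the threshold that was chosen
ret, img_otsu = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print('Otsu threshold:', ret)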

Let’s check another example:-

Following is an image of a Shopfront.

https://s2.geograph.org.uk/geophotos/05/37/74/5377402_b2ace888_1024x1024.jpg

First, open the image in grayscale mode and run Tesseract on it:

img2 = cv2.imread('5377402_b2ace888_1024x1024.jpg', 0)
plt.imshow(img2, 'gray')
plt.show()
# Now test the result
text = image_to_string(img2)
print(text)

Unfortunately, for this image, it didn’t output any text. So what went wrong?

It looks like the background covers more area than the text. Let’s crop that particular area and then test again.

# Crop the region containing the text
crop_img = img2[20:150, 420:717]
plt.imshow(crop_img, 'gray')
plt.show()
# Now test the result
text = image_to_string(crop_img)
print(text)

Output:-

MAGDALENKA
DELICATESSEN
POLISH

Again, we can see that the result is not accurate: it didn’t detect ‘FOOD’, written at the bottom right of the image. This time I will try the adaptive threshold. Following is the syntax for it.

cv2.adaptiveThreshold(img, maxValue, adaptiveMethod, thresholdType, blockSize, C)

Adaptive threshold

In this method, the algorithm calculates the threshold for small regions of the image. So we get different thresholds for different regions of the same image, which gives better results for images with varying illumination.

# Adaptive Threshold Mean C
img_thresh_mean_c = cv2.adaptiveThreshold(crop_img, 255,
                                          cv2.ADAPTIVE_THRESH_MEAN_C,
                                          cv2.THRESH_BINARY_INV,
                                          17, -2)
# Plot the result
plt.imshow(img_thresh_mean_c, 'gray')
plt.show()
# Test the output of the adaptive threshold
text = image_to_string(img_thresh_mean_c)
print(text)

Output:-

" MAGDALENKA 
DELICATESSEN
POLISH FOOD,

Though the code produces the correct text, it also interprets noise as characters. To overcome this, we blur the image a little and then test again.

# Blur the thresholded image with a 7x7 box filter
img_thresh_mean_c_blur = cv2.blur(img_thresh_mean_c, (7, 7))
# Test the output on the blurred image
text = image_to_string(img_thresh_mean_c_blur)
print(text)

Output:-

MAGDALENKA 
DELICATESSEN
POLISH FOOD

Perfect!

Image blurring is achieved by convolving the image with a normalized box filter. It simply takes the average of all the pixels under the kernel area and replaces the central element with this average. This is done by the function cv2.blur(). It is useful for removing noise in the image. Blurring is a separate topic in itself, so I will cover it later in another post.
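
For reference, here is a rough sketch (assuming the variables from the example above) of the same operation written out as an explicit convolution with a normalized 7x7 box kernel:

import numpy as np

# cv2.blur(src, (7, 7)) averages every pixel with its 7x7 neighbourhood,
# i.e. it convolves with a kernel whose 49 entries are all 1/49
kernel = np.ones((7, 7), np.float32) / 49
blurred = cv2.filter2D(img_thresh_mean_c, -1, kernel)  # -1: keep the source depth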

Conclusion

Image thresholding is one of the most commonly used techniques in many image processing tasks. However, keep in mind that for a good segmentation we may need to try different threshold values.

Note:- Thresholding gives better results if the image has higher contrast. Try changing the contrast of the pixels and then applying the threshold.
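
As a small sketch of that idea (the alpha and beta values here are only illustrative, not from the original post):

# Simple linear contrast stretch: new_pixel = alpha * old_pixel + beta
# (alpha > 1 increases contrast, beta shifts brightness), then threshold
img_contrast = cv2.convertScaleAbs(img, alpha=1.5, beta=0)
ret, img_bin = cv2.threshold(img_contrast, 100, 255, cv2.THRESH_BINARY)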

In the end, I want you to try this by yourself and see what results you get with different images. You can try to find other text in the shopfront image, or try this image and share your results. Are you able to find the text in the image?

Hint: Try cv2.THRESH_BINARY in cv2.threshold() with a threshold value of 1.

For more details, please check out this link and Notebook; it has an in-depth analysis of various thresholding techniques that you can also apply. Thanks to Ankit Kumar Singh for sharing his Notebook with us.

Sagar is a computer vision and robotics expert with a focus on Perception & Localization | Twitter: twitter.com/sagarcadet | Linkedin: linkedin.com/in/sagark30