Image Histograms in OpenCV

Raghunath D
7 min read · Jan 29, 2019


Understanding image histograms using OpenCV

A histogram is a very important tool in image processing. In general, a histogram is a graphical representation of the distribution of data. An image histogram is a graphical representation of the distribution of pixel intensities in a digital image.

The x-axis indicates the range of values the variable can take. This range can be divided into a series of intervals called bins. The y-axis shows the count of how many values fall within that interval or bin.

  • When plotting the histogram of an image, we have the pixel intensity on the X-axis and the frequency on the Y-axis. As with any other histogram, we can decide how many bins to use.
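To make the idea of bins concrete, here is a minimal NumPy sketch. The random image is only an assumption standing in for a real photo; the point is that the same pixels can be counted with one bin per intensity value or with a handful of wider bins.

```python
import numpy as np

# Synthetic 8-bit gray-scale "image" (an assumption, stands in for a real photo).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# One bin per intensity value: 256 bins covering [0, 256).
counts_256, edges_256 = np.histogram(image, bins=256, range=(0, 256))

# Coarser view: 8 bins, each covering 32 consecutive intensity values.
counts_8, edges_8 = np.histogram(image, bins=8, range=(0, 256))

print(counts_8)   # how many pixels fall in each of the 8 bins
print(edges_8)    # the bin boundaries: 0, 32, 64, ..., 256
```

Whatever the number of bins, the counts always add up to the number of pixels in the image; fewer bins just give a smoother, less detailed picture.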

A histogram can be calculated both for the gray-scale image and for the colored image. In the first case we have a single channel, hence a single histogram. In the second case we have 3 channels, hence 3 histograms.

Calculating the histogram of an image is very useful as it gives an intuition regarding some properties of the image such as the tonal range, the contrast and the brightness.

→ To identify the dominant colors in an image, we can use the histogram plot of the Hue channel.

In an image histogram, the x-axis represents the different intensity values, which lie between 0 and 255 for an 8-bit image, and the y-axis represents the number of times a particular intensity value occurs in the image.

Calculating the Histogram

OpenCV provides the function cv2.calcHist to calculate the histogram of an image. The signature is the following:

cv2.calcHist(images, channels, mask, histSize, ranges)

where:
1. images - is the image we want to calculate the histogram of, wrapped as a list, so if our image is in the variable image we will pass [image],
2. channels - is the index of the channel to consider, wrapped as a list ([0] for gray-scale images, as there's only one channel, and [0], [1] or [2] for color images if we want to consider the blue, green or red channel respectively, since OpenCV loads images in BGR order),
3. mask - is a mask to be applied on the image if we want to consider only a specific region (we're going to ignore this in this post),
4. histSize - is a list containing the number of bins to use for each channel,
5. ranges - is the range of the possible pixel values, which is [0, 256] for 8-bit images (where 256 is not inclusive).

The returned value hist is a numpy.ndarray with shape (n_bins, 1) where hist[i][0] is the number of pixels having an intensity value in the range of the i-th bin.

We can simplify this interface by wrapping it in a function that, in addition to calculating the histogram, also draws it (for the moment we’re going to fix the number of bins to 256):

import cv2
from matplotlib import pyplot as plt

def draw_image_histogram(image, channels, color='k'):
    hist = cv2.calcHist([image], channels, None, [256], [0, 256])
    plt.plot(hist, color=color)
    plt.xlim([0, 256])

Let’s now see the histograms of these three sample images:

Gray-scale histogram

Plotting histogram for a gray-scale image.

import cv2
import matplotlib.pyplot as plt

image = cv2.imread('dark-tones.jpg')
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
histogram = cv2.calcHist([gray_image], [0], None, [256], [0, 256])
plt.plot(histogram, color='k')
plt.show()

If we plot the histogram for each of the images shown above, we get histogram plots like this:

Let’s now analyze these plots and see what kind of information we can extract from them.

  1. From the first one we can infer that the pixels of the corresponding image have low intensity, as almost all of them are approximately in the [0, 60] range.
  2. From the second one we can see that the distribution of the pixel intensities is still skewed towards the darker side, as the median value is around 80, but the variance is much larger.
  3. Then from the last one we can infer that the corresponding image is much lighter overall, but also has a few dark regions.
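The median and the spread mentioned in the list above can also be computed directly from a histogram instead of being read off the plot. This is a minimal sketch; histogram_stats is a hypothetical helper name, not an OpenCV function, and it assumes a histogram with one bin per intensity value.

```python
import numpy as np

def histogram_stats(hist):
    """Return (mean, median, std) of the intensities described by a histogram
    that has one bin per intensity value."""
    counts = np.asarray(hist, dtype=np.float64).ravel()
    levels = np.arange(counts.size)
    total = counts.sum()
    mean = (levels * counts).sum() / total
    # Median: first level at which the cumulative count reaches half the pixels.
    median = int(np.searchsorted(np.cumsum(counts), total / 2.0))
    std = np.sqrt((counts * (levels - mean) ** 2).sum() / total)
    return mean, median, std

# Tiny example: three pixels at intensity 10 and one at 200.
pixels = np.array([10, 10, 10, 200], dtype=np.uint8)
hist = np.bincount(pixels, minlength=256)
mean, median, std = histogram_stats(hist)
print(mean, median)  # 57.5 10
```

Because the helper flattens its input, it should also accept the (256, 1) array returned by cv2.calcHist when 256 bins over [0, 256] are used.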

Here are the gray-scale images with the corresponding histograms:

Color Histogram

Let’s now move onto the histograms of the colored sample images.

import cv2
import matplotlib.pyplot as plt

image = cv2.imread('dark-tones.jpg')

# cv2.imread returns the channels in BGR order
for i, col in enumerate(['b', 'g', 'r']):
    hist = cv2.calcHist([image], [i], None, [256], [0, 256])
    plt.plot(hist, color=col)
    plt.xlim([0, 256])

plt.show()

If we execute this code for the sample images we obtain the following histograms:

The plots are in the same order as the sample images.

  1. As we could have expected from the first plot, we can see that all the channels have low intensities corresponding to very dark red, green and blue. We also have to consider that the color black, which is given by (0, 0, 0) in RGB, is abundant in the corresponding image and that may explain why all the channels have peaks in the lower part of the X axis. Anyway, this can't really be appreciated with this type of visualization, given that we're plotting the three channels independently of each other. Multi-dimensional histograms make it possible to observe the distribution of the combinations of the channels' values.
  2. From the second plot we can observe that there’s a dark red peak that may correspond to the rocks and the mountains, while both the green and the blue channels have a wider range of values.
  3. From the last plot, if we exclude the peaks of all the channels in the interval [0, 30], we can observe the opposite of what we saw in the first plot. All three channels have high intensities, and if we consider that (255, 255, 255) in RGB corresponds to white, then by looking at the image it's clear why the histogram has this distribution.

Here are the sample images with the corresponding histograms:

Histogram Equalization

The histogram equalization process is an image processing method to adjust the contrast of an image by modifying the image’s histogram.

The intuition behind this process is that histograms with large peaks correspond to images with low contrast where the background and the foreground are both dark or both light. Hence histogram equalization stretches the peak across the whole range of values leading to an improvement in the global contrast of an image.

It is usually applied to gray-scale images and it tends to produce unrealistic effects, but it is widely used where high contrast is needed, such as in medical or satellite images.
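The stretching described above can be sketched in a few lines of NumPy using the cumulative histogram. This is the textbook formulation, written as an illustration under that assumption; equalize_gray is a hypothetical helper and OpenCV's own implementation may differ in rounding details.

```python
import numpy as np

def equalize_gray(image):
    """Textbook histogram equalization for an 8-bit single-channel image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]      # cumulative count at the darkest value present
    denom = image.size - cdf_min
    if denom == 0:                 # constant image: nothing to stretch
        return image.copy()
    # Map each intensity to its scaled cumulative frequency in [0, 255].
    lut = np.round((cdf - cdf_min) / denom * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]

# Low-contrast example: all values squeezed into [100, 120].
image = np.array([[100, 100, 110, 120]], dtype=np.uint8)
equalized = equalize_gray(image)
print(equalized.min(), equalized.max())  # 0 255
```

The darkest value present is mapped to 0 and the brightest to 255, which is why a narrow peak ends up spread over the whole range.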

OpenCV provides the function cv2.equalizeHist to equalize the histogram of an image. It expects an 8-bit single-channel image. The signature is the following:

cv2.equalizeHist(image)

Histogram equalization for gray-scale images:

Let’s now see how we can easily equalize a gray-scale image and show it. Here’s the code:

def show_grayscale_equalized(image):
    grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    eq_grayscale_image = cv2.equalizeHist(grayscale_image)
    plt.imshow(eq_grayscale_image, cmap='gray')
    plt.show()

Histogram equalization for colored images:

The most naive approach consists of applying the same process to each of the three RGB channels separately and then joining them back together. The problem is that this process changes the relative distributions of the colors and may consequently lead to dramatic changes in the image's color balance.

def show_rgb_equalized(image):
    # cv2.split returns the channels in BGR order
    channels = cv2.split(image)
    eq_channels = []
    for ch in channels:
        eq_channels.append(cv2.equalizeHist(ch))

    eq_image = cv2.merge(eq_channels)
    # Matplotlib expects RGB, so convert before displaying
    eq_image = cv2.cvtColor(eq_image, cv2.COLOR_BGR2RGB)
    plt.imshow(eq_image)
    plt.show()

Example:

import cv2
import matplotlib.pyplot as plt

image = cv2.imread('dark-tones.jpg')

# Histogram equalization, applied to each BGR channel separately
channels = cv2.split(image)
eq_channels = []
for ch in channels:
    eq_channels.append(cv2.equalizeHist(ch))
eq_image = cv2.merge(eq_channels)

# Show the original and the equalized image
cv2.namedWindow("Original", cv2.WINDOW_AUTOSIZE)
cv2.namedWindow("Equalized Image", cv2.WINDOW_AUTOSIZE)
cv2.imshow("Original", image)
cv2.imshow("Equalized Image", eq_image)
cv2.waitKey()
cv2.destroyAllWindows()

# Plot the histogram of each channel of the equalized image
colors = ('b', 'g', 'r')
for i, color in enumerate(colors):
    histogram = cv2.calcHist([eq_image], [i], None, [256], [0, 256])
    plt.plot(histogram, color=color)
    plt.xlim([0, 256])
plt.show()

An alternative is to first convert the image to the HSV color space and then apply the histogram equalization only to the value channel, leaving the hue and the saturation of the image unchanged.

Here's the code that applies the histogram equalization on the value channel of the HSV color space:

def show_hsv_equalized(image):
    H, S, V = cv2.split(cv2.cvtColor(image, cv2.COLOR_BGR2HSV))
    eq_V = cv2.equalizeHist(V)
    eq_image = cv2.cvtColor(cv2.merge([H, S, eq_V]), cv2.COLOR_HSV2RGB)
    plt.imshow(eq_image)
    plt.show()

There are also other algorithms for histogram equalization that are more robust such as AHE (Adaptive Histogram Equalization) and CLAHE (Contrast Limited Adaptive Histogram Equalization).

Image histograms are simple, but widely used in image processing. One interesting application is the use of image histograms to build an image search engine based on the similarity between them, as explained in this blog post.

Sources

Image histograms


Raghunath D

Software Engineer working in Oracle. Data Enthusiast interested in Computer Vision and wanna be a Machine learning engineer.