A Beginners Guide to Computer Vision (Part 4)- Pyramid

Published in

Analytics Vidhya

4 min readApr 26, 2020

Learn about image pyramids and their implementation

Pyramid, a well known shape to mankind from very long time. Well, this article is not about the “Shape Pyramid” but we are going to talk about “Image Pyramids”.

Sometimes to detect an object (like face or car or anything thing of that sort) in an image, we need to resize or sub-sample the image and run the further analysis. In such cases, we maintain a collection of same image with different resolutions. We call this collection as Image Pyramid. The reason behind calling it as a pyramid is when we arrange the images in decreasing order of their resolution, we obtain a shape like pyramid with square base. look at the image below to understand it in more detail.

Area of the new level will be 1/4 of level below it. If the size of base image(High resolution or level 0) is M X N then the size of level above it will be (M/2 X N/2).

There are two kinds of pyramids: 1) Gaussian Pyramid and 2) Laplacian Pyramid.

Gaussian Pyramid

In Gaussian Pyramid, we apply the Gaussian filter of 5 X 5 size before we sub-sample the image. There two kinds of operations here: Reduce and Expand. as name suggests, in reduce operation we will reduce( half of width and height) the image resolution and in expand operation we will in increase (two times the width and height) the image resolution.

Reduce or Down

The reduce operation in Gaussian Pyramids in done according to the relation given below.

source: https://www.youtube.com/watch?v=KO7jJt0WHag&list=PLd3hlSJsX_ImKP68wfKZJVIPTd8Ie5u-9&index=5

where l represents level, w(m,n) is the window function(Gaussian).

The only difference between reduce and convolution operation is, the value of stride in convolution operation is 1 but in case of reduce operation is 2. we convolve the Gaussian mask with every alternate row and column. Look at the code snippet below for understanding operation further.

lowResImage = cv2.pyrDown(highResImage) # Reduce operation of OpenCV

Expand or Up

The expand operation in Gaussian Pyramids in done according to the relation given below.

where l represents level, w(p,q) is the window function(Gaussian).

To understand the above relation more clearly, lets expand the above relation in one dimension case.

Non-integer terms in above equation will be eliminated finally the equation will have only three terms. Look the image below

The terms a,b,c,d and e in the above image are the Gaussian weights in one dimension. In expand operation, new pixels are created from same old pixels with different gaussian weight combinations.

highResImage = cv2.pyrUp(lowResImage) # Expand operation of OpenCV

Laplacian Pyramid

In Gaussian pyramid, we apply the gaussian blur and sub-sample the image. In Laplacian pyramid, we apply the Laplacian of gaussian and sub-sample the image. You can recall this concept of Laplacian of Gaussian from Marr Hildreth edge detector from my previous article.

A Beginners Guide to Computer Vision (Part 2)- Edge Detection

Learn the algorithms behind edge detection and their implementation.

medium.com

In practice, we create the Laplacian pyramid from the Gaussian pyramid. look the formula below for more understanding.

EPIPHANY, JEBAMALAR LEAVLINE & Sutha, Shunmugam. (2014). Design of FIR Filters for Fast Multiscale Directional Filter Banks. International Journal of u- and e-Service, Science and Technology. 7. 10.14257/ijunesst.2014.7.5.20.

Look at the code snippet below to generate a level in laplacian pyramid.

laplaceLevel1 = gaussianLevel1 - cv2.pyrUp(gaussianLevel2)

Yeah that’s the end is article and stay tuned for more on computer vision. Code of Pyramids is in the link below.

tbharathchandra/A-Beginners-Guide-to-Computer-Vision

Permalink Dismiss GitHub is home to over 40 million developers working together to host and review code, manage…

github.com