Edge Detection in Computer Vision Using Convolutions

Traditional approach

Konstantin Dorichev
Oct 24, 2019
Edge detection using convolution: Color -> Monochrome -> Gradient -> Edges.

Introduction

Isn’t it a wonder how the human brain processes visual information?! I personally believe there is a Creator, a God who carefully designed the eye and the brain. But did you ever wonder how computers process images and detect objects?

Edges are important for conveying visual information. Before the neural network boom, convolutions were already used to detect edges in images.

In this article I will describe that traditional approach: using convolutions for edge detection in computer vision.

An Image as a Function

An image can be represented as a brightness/intensity function of two variables:

f(x, y)

where x, y are the horizontal and vertical coordinates respectively (see Figure 1):

Fig. 1. An image as a function example — my profile’s photo, monochrome.
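To make this concrete, here is a minimal sketch of loading a photo as such a function, a 2-D tensor of brightness values sampled on the pixel grid (the filename is just a placeholder):

import numpy as np
import torch
from PIL import Image

# A monochrome image is the function f(x, y) sampled on a pixel grid:
# a 2-D tensor of brightness values, here scaled to [0, 1].
img = Image.open('profile.jpg').convert('L')   # 'L' = single channel
f = torch.from_numpy(np.asarray(img)).float() / 255.

print(f.shape)        # torch.Size([height, width])
print(f[100, 200])    # brightness at y = 100, x = 200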

To detect edges we will use partial derivatives of this function, looking for places where the intensity changes abruptly in either the x or y direction. To calculate these derivatives, the convolution operation is used.
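For instance, the derivative along x for a single image row can be estimated with a central finite difference, f(x+1, y) - f(x-1, y), which is essentially what the edge filters below compute (summed over three rows). A minimal sketch with plain tensor slicing:

import torch

row = torch.tensor([1., 1., 1., 0., 0.])   # brightness values along one row
dfdx = row[2:] - row[:-2]                  # central difference f(x+1) - f(x-1)
print(dfdx)   # tensor([ 0., -1., -1.]) -- the drop from 1 to 0 stands out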

Understanding Convolution

The convolution operation is performed by sliding a kernel (a.k.a. a filter) matrix over the input matrix, computing the Hadamard (a.k.a. element-wise) product of the kernel with the underlying part of the input matrix, summing up the result, and recording the sum in the output matrix; see Figure 2.

As a reminder, the Hadamard product of two matrices is computed element-wise: (A ∘ B)ij = Aij · Bij.

Fig. 2. Left: the kernel (green) slides over the input image (blue). Right: the result is summed and added to the output matrix. Image credit: Arden Dertat

Let’s say we have this input matrix inp and kernel k (as in Figure 2):

inp = tensor([[1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 1, 1, 0],
              [0, 1, 1, 0, 0]])
k = tensor([[1, 0, 1],
            [0, 1, 0],
            [1, 0, 1]])

On the first step the kernel is placed in the top left corner of the input matrix (highlighted in bold), and we calculate the Hadamard product:

[[1, 1, 1],   [[1, 0, 1],   [[1, 0, 1],
 [0, 1, 1], *  [0, 1, 0], =  [0, 1, 0],
 [0, 0, 1]]    [1, 0, 1]]    [0, 0, 1]]

Then we sum the result (getting 4) and place it into the top left corner of the output. We then shift the kernel and repeat the process until reaching the bottom right of the input.
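We can verify the full sliding-window result with PyTorch. Note that F.conv2d actually computes cross-correlation, i.e. it slides the kernel without flipping it, which matches the process described above:

import torch
import torch.nn.functional as F

inp = torch.tensor([[1, 1, 1, 0, 0],
                    [0, 1, 1, 1, 0],
                    [0, 0, 1, 1, 1],
                    [0, 0, 1, 1, 0],
                    [0, 1, 1, 0, 0]], dtype=torch.float32)
k = torch.tensor([[1, 0, 1],
                  [0, 1, 0],
                  [1, 0, 1]], dtype=torch.float32)

# conv2d expects (batch, channels, height, width) shapes.
out = F.conv2d(inp[None, None], k[None, None])
print(out[0, 0])
# tensor([[4., 3., 4.],
#         [2., 4., 3.],
#         [2., 3., 4.]])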

How is this related to gradients? When the kernel is of a special kind, like the ones we’ll see below, this convolution operation produces a gradient of the original image/function. Let’s look at an example.

A Simple Image Example

Fig. 3. Source image with selection marked.

For simplicity we will play with a monochrome image, which has only one color/brightness channel; see Figure 3.

Here white is encoded as 1, black as 0.

So the edge, selected with the red square in Figure 3, looks like this:

tensor([[1., 1., 1., 1., 0., 0., 0., 0.],
        [1., 1., 1., 1., 0., 0., 0., 0.],
        [1., 1., 1., 1., 0., 0., 0., 0.],
        [1., 1., 1., 1., 0., 0., 0., 0.],
        [1., 1., 1., 1., 0., 0., 0., 0.]])

Vertical edge detection

Fig. 4. Image’s gradient in horizontal direction.

A simple filter for vertical edge detection looks like this:

tensor([[-1.,  0.,  1.],
        [-1.,  0.,  1.],
        [-1.,  0.,  1.]])

When convolved with the source image, this filter produces the image in Figure 4. The images representing the gradients are encoded as follows:

  • black: the gradient is negative (brightness decreases);
  • white: the gradient is positive (brightness increases);
  • grey: the gradient is zero.

Note: The resulting image has two types of vertical edges detected:

  • black edges, indicating a shift from white to black when moving left to right: the gradient is at its minimum (rendered as zeros);
  • white edges, indicating a shift from black to white when moving left to right: the gradient is at its maximum (rendered as ones).

After the convolution, the edge selected with the red square in Figure 3 looks like this:

tensor([[ 0.,  0.,  0., -3., -3.,  0.,  0.,  0.],
        [ 0.,  0.,  0., -3., -3.,  0.,  0.,  0.],
        [ 0.,  0.,  0., -3., -3.,  0.,  0.,  0.],
        [ 0.,  0.,  0., -3., -3.,  0.,  0.,  0.],
        [ 0.,  0.,  0., -3., -3.,  0.,  0.,  0.]])
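Here is a minimal sketch reproducing these values. (The 5x8 tensor above is presumably cut from the convolution of a larger crop; a valid convolution of just this patch yields the same interior values.)

import torch
import torch.nn.functional as F

# The white-to-black patch from Figure 3 and the vertical-edge filter.
patch = torch.tensor([[1., 1., 1., 1., 0., 0., 0., 0.]]).repeat(5, 1)
vert = torch.tensor([[-1., 0., 1.],
                     [-1., 0., 1.],
                     [-1., 0., 1.]])

grad_x = F.conv2d(patch[None, None], vert[None, None])[0, 0]
print(grad_x)
# tensor([[ 0.,  0., -3., -3.,  0.,  0.],
#         [ 0.,  0., -3., -3.,  0.,  0.],
#         [ 0.,  0., -3., -3.,  0.,  0.]])
#
# -3 appears exactly where brightness drops from 1 to 0: each of the
# kernel's three rows contributes (-1)*1 + 0 + 1*0 = -1.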

Horizontal edge detection

Fig. 5. Image’s gradient in vertical direction.

A filter for horizontal edge detection looks like this (the vertical filter transposed):

tensor([[-1., -1., -1.],
        [ 0.,  0.,  0.],
        [ 1.,  1.,  1.]])

When convolved with the source image, it produces the image in Figure 5.

Result

To get the desired result we need to combine the horizontal and vertical gradients. We use the gradient magnitude, G = sqrt(Gx² + Gy²), computed pixel by pixel.

Fig. 6. Edges detected.

Combining the images with horizontal and vertical edges detected, we get Figure 6.
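Here is a minimal sketch of the whole pipeline, assuming img is a monochrome image tensor scaled to [0, 1] (detect_edges is a name I made up for illustration):

import torch
import torch.nn.functional as F

def detect_edges(img):
    # Vertical-edge filter; the horizontal filter is its transpose.
    vert = torch.tensor([[-1., 0., 1.],
                         [-1., 0., 1.],
                         [-1., 0., 1.]])
    horiz = vert.t()

    # conv2d expects (batch, channels, height, width);
    # padding=1 keeps the output the same size as the input.
    gx = F.conv2d(img[None, None], vert[None, None], padding=1)
    gy = F.conv2d(img[None, None], horiz[None, None], padding=1)

    # Gradient magnitude: large wherever brightness changes sharply
    # in either direction, i.e. at the edges.
    return (gx ** 2 + gy ** 2).sqrt()[0, 0]

Applied to the source image from Figure 3, this should produce an edge map like the one in Figure 6.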

Not rocket science, huh!

Now for a more interesting example, a complex image. I will use my profile’s photo.

Fig. 7. My profile’s photo edges detected.

Now see a 3D representation of Figure 7, with some tiny edges removed and a color map applied for clarity.

Fig. 8. Edges of my profile’s photo detected, tiny edges removed.

Conclusion

In this article we explored the meaning of convolutions and how they are used for edge detection in Computer Vision. Understanding this will help you better grasp how Convolutional Neural Networks operate.

References

  1. Play with the code for this article (Google Colab notebook).
  2. Learn more: Introduction to Computer Vision by Georgia Tech, on Udacity. An excellent and free course.
