3. Introduction to Computer Vision with Deep Learning: Channels and feature maps

Inside AI
Deep-Learning-For-Computer-Vision
4 min read · May 20, 2019

Written by Praveen Kumar and Nilesh Singh.

Prerequisite: In case you missed the first session, please go back to the previous session on kernels.

By now, you know all the basics about kernels, with hands-on experience. We are sure you tried a lot of different kernels and enjoyed the first complete session on kernels.

Before we begin…

NOTE: We are starting a new Telegram group to handle all your questions and queries. You can openly discuss concepts with other participants and gain more insight, which will become more helpful as we move further into the publication. (Telegram is preferred over WhatsApp because of group member limits.) [Follow this LINK to join in ASAP]

Now let’s come back and switch our attention to another very important concept in computer vision. Any guesses? Treat yourself if you guessed channels.

To understand channels, let’s assume you take a picture of a dog with your phone. Now let’s split that image into channels and see what happens. Take a look:

Image 1

What do you see? I’m sure it’s now very difficult to tell that this is a picture of a dog. But we also see that the image is made up of 3 channels. These channels, when combined, produce the final image you just took. Let’s dive deeper into channels with some technicalities.
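Here is a minimal sketch of that split in NumPy. The tiny 2x2 image and its pixel values are made up for illustration (they are not the dog photo), and we assume the usual (height, width, channels) layout with the channels in red, green, blue order.

```python
import numpy as np

# Hypothetical 2x2 RGB image with shape (height, width, channels).
# The pixel values are invented for illustration only.
image = np.array([
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 0]],
], dtype=np.uint8)

# Each channel is a 2D array holding one color's intensity at every pixel.
red = image[..., 0]
green = image[..., 1]
blue = image[..., 2]

# Stacking the three channels back along the last axis rebuilds the image.
rebuilt = np.stack([red, green, blue], axis=-1)

print(red.shape)                       # (2, 2)
print(np.array_equal(rebuilt, image))  # True
```

Viewed on its own, each channel is just a grayscale grid, which is why a single channel can be hard to recognize as the original picture.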

The red channel is formed by extracting the red color wherever it is present in the image. This is called a feature map of the red color, or simply a feature map. So we can define a feature map as the map formed by collecting all the similar features from the image. The features, in turn, are the outputs of kernels. Each time a kernel is applied at one position in the image, it produces a single feature, not a feature map. The feature map is produced once the kernel has extracted all the similar features from across the image. Hence, it’s very important to understand that the output of a kernel is a feature, and all the similar features combine to form a feature map. Each feature map, in turn, forms a single channel.

Whoa!! That was a little heavy. But we will give you more examples to help you understand this more intuitively.

Let’s take a look at the following images to understand all these terms visually.

Image 2

Here the source is the input image, and we apply a 3x3 kernel over it. At the position shown, it produces -1 as a feature. This is just a single feature. When the kernel has extracted all the features from the image, that is, when every cell in the result matrix has been filled, the completed result matrix is called the feature map produced by the kernel. Hence we can say that the output of a kernel is a feature, and we get a feature map by collecting all such similar features from the image. So, one feature map forms a channel.
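To make the feature versus feature map distinction concrete, here is a small sketch. The helper name `feature_map`, the 5x5 input, and the kernel values are our own assumptions and are not the numbers from Image 2. We also assume a stride of 1 and no padding, and, as is common in deep learning, we do not flip the kernel.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and collect every feature."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    result = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # One kernel position gives one number: a single feature.
            result[i, j] = np.sum(patch * kernel)
    # Once every cell is filled, the result matrix is the feature map.
    return result

# Hypothetical 5x5 input and a 3x3 vertical-edge kernel (illustrative values).
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

fmap = feature_map(image, kernel)
print(fmap.shape)  # (3, 3)
print(fmap[0, 0])  # -6.0
```

A 3x3 kernel fits into a 5x5 image at 3x3 different positions, so the feature map here holds 9 features.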

Now let’s take a look at another image:

Image 3

In this image, we have 4 kernels (Red, Blue, Green, and Yellow). When these kernels extract all the features by running over the image, we get 4 feature maps (or channels), that is, the red, blue, green, and yellow feature maps respectively.
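The same idea can be sketched in code: one feature map per kernel, stacked together as channels. The four kernels below are hypothetical stand-ins that we picked for illustration (the article does not give the actual values behind Image 3), and the 7x7 input is made up as well.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and collect every feature."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i + kh, j:j + kw] * kernel)
                      for j in range(out_w)]
                     for i in range(out_h)])

# Four hypothetical 3x3 kernels standing in for the red, blue, green, and yellow ones.
kernels = {
    "red": np.array([[1, 0, -1]] * 3, dtype=float),     # responds to vertical edges
    "blue": np.array([[1, 0, -1]] * 3, dtype=float).T,  # responds to horizontal edges
    "green": np.full((3, 3), 1 / 9),                    # averages the 3x3 patch
    "yellow": np.array([[0, 0, 0],
                        [0, 1, 0],
                        [0, 0, 0]], dtype=float),       # copies the center pixel
}

# Hypothetical 7x7 single-channel input.
image = np.arange(49, dtype=float).reshape(7, 7)

# One feature map per kernel, stacked along a new leading axis as channels.
channels = np.stack([feature_map(image, k) for k in kernels.values()])
print(channels.shape)  # (4, 5, 5)
```

Four kernels give an output with four channels, and each channel is the 5x5 feature map of one kernel.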

Phewww!!! I guess at least now you have better intuition about channels. You do not need to worry if you do not completely understand channels at this point. We will be covering channels more and more as we move deeper into our publication. So let’s just end this article here and take all the intuitions from this article to the upcoming ones.

To summarize, take a look at these intuitive definitions.

Feature: The output of a kernel at one position is a feature. (As we have seen, -1 is a feature in Image 2.)

Feature map: A collection of all the similar features from an image.

Channels: Each feature map results in one channel.

Stay tuned to learn all about channels and kernels, how they relate, and how they change based on the input image as well as the kernels/filters.


Hope you enjoyed it. See you soon…
