Image Data: Let’s talk in numbers

Arshi Saleh
Let’s Deploy Data.
2 min read · Jul 17, 2020

I have been selected for the Machine Learning Scholarship for Microsoft Azure. Knowing the value of a fantastic community and Udacity course material, I dived into the course straight away. I hit my first roadblock when I was introduced to Image Data.

I do know about pixels, RGB, and grayscale images, but the concept of encoding image data into numeric values was new to me. Since I was now part of an impressive and very knowledgeable community, I decided to ask for help. After a lot of discussion and guidance, I finally understood the relationship between colored images, pixels, RGB, and channels. This made me think that I should share my newly found knowledge. So, let's get started.

What are pixels?

Pixels are the smallest elements that form an image — the more pixels an image has, the better its quality. Pixels are arranged in a grid, and the location of each pixel can be specified by its x-axis and y-axis coordinates.

Number of Pixels = Resolution of Image = Height × Width
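The formula above can be checked in a couple of lines. The 1920×1080 frame here is just a hypothetical example, not from the course:

```python
# Number of pixels = resolution = height * width.
# Using a hypothetical Full HD frame as an example.
height, width = 1080, 1920
num_pixels = height * width
print(num_pixels)  # 2073600, i.e. about 2.1 megapixels
```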

RGB Image

An RGB image is an M*N*3 data array that stores red, green, and blue color components for each pixel. It consists of three independent image channels, one for each primary color, and each channel records the intensity of its color. The number of bits used to store each pixel is called the color depth or bits per pixel; a typical RGB image uses 8 bits per channel, for a color depth of 24 bits.

M*N*3 Data Array
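A minimal NumPy sketch of such an M*N*3 array, using a made-up 2×2 image (the pixel values here are just illustrative):

```python
import numpy as np

# A hypothetical 2x2 RGB image as an M x N x 3 array.
# Each pixel stores three intensities (red, green, blue), 0-255.
image = np.array([
    [[255, 0, 0], [0, 255, 0]],      # a red pixel, a green pixel
    [[0, 0, 255], [255, 255, 255]],  # a blue pixel, a white pixel
], dtype=np.uint8)

print(image.shape)  # (2, 2, 3): M=2 rows, N=2 columns, 3 channels

# Each channel is an independent M x N grid of intensities.
red_channel = image[:, :, 0]
print(red_channel)
```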

Grayscale Images

Images encoded with a single channel are said to be grayscale. Each pixel is represented by a single number, where 0 is black and 255 is white (for 8-bit images), so grayscale images contain only black, white, and shades of gray.
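In array terms, a grayscale image needs no channel axis at all — one number per pixel. A small illustrative sketch:

```python
import numpy as np

# A hypothetical 2x2 grayscale image: a single channel,
# one number per pixel, 0 = black, 255 = white.
gray = np.array([
    [0, 85],     # black, dark gray
    [170, 255],  # light gray, white
], dtype=np.uint8)

print(gray.shape)  # (2, 2): just M x N, no third dimension
```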

How to encode an Image?

We can encode an image numerically as an array with three dimensions: height, width, and depth (the number of channels). An image has multiple rows and columns of pixels, and each pixel stores three color values, one per channel. The value stored for each color tells us how bright that color is at that pixel.

To encode an image, we need to know the horizontal position, the vertical position, and the color of each pixel.

When encoding images, keep in mind that they should share a uniform aspect ratio and that their pixel values should be normalized.
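The normalization step above can be sketched as follows — a common convention (assumed here, not specified in the course) is to scale 8-bit intensities from [0, 255] down to [0, 1]:

```python
import numpy as np

# Normalize a hypothetical 1x2 RGB image: divide each 8-bit
# intensity by 255 so every value lands in [0, 1].
image = np.array([[[255, 0, 0], [0, 255, 0]]], dtype=np.uint8)
normalized = image.astype(np.float32) / 255.0

print(normalized.min(), normalized.max())  # 0.0 1.0
```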

Conclusion:

One of the most common questions in the community was whether an image of height 4 units and width 4 units has 16 pixels or 16*3 pixels, where 3 is the number of channels. The answer is 16 pixels, because each pixel has 3 channels that give the intensity of each color at that pixel.
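This distinction is easy to verify with a quick sketch: the array holds 4*4*3 = 48 values, but the image still has only 4*4 = 16 pixels.

```python
import numpy as np

# A 4x4 RGB image: 16 pixels, each with 3 channel values.
image = np.zeros((4, 4, 3), dtype=np.uint8)

num_pixels = image.shape[0] * image.shape[1]
print(num_pixels)  # 16 pixels
print(image.size)  # 48 stored values (16 pixels * 3 channels)
```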


Tableau Developer | Data Visual Designer | Community Moderator @Udacity