Image Processing in Computer

Feb 2, 2023

HOW DOES THE MACHINE READ IMAGES AND USE THEM IN COMPUTER VISION?

Before looking at how machines read images and use them in computer vision, it is important to understand how images are read and stored on a computer. This matters especially when working on computer vision applications. In this article, we discuss what a digital image is and how it is actually stored on a computer. We cover two popular formats in which images are represented: grayscale and RGB.

What is a Pixel?

Pixels are the basic building blocks of a digital image. A pixel is the smallest controllable element of an image.

The eagle's eye is made up of many pixels. When we zoom in far enough, the smallest element of the computer display appears, which is 1 pixel.

How are images stored on a computer?

Let us first understand how black and white images are stored on a computer, and then see how coloured images are stored. Since computer circuits work with binary digits, the simplest approach is to store an image in binary form.
The image shown below is a 7x7 pixel image (7 rows and 7 columns), which means the dimension of the image is 7x7.

Each of these pixels is represented by a numerical value, and these numbers are called pixel values. A pixel value denotes the intensity of the pixel. For a black and white image, each pixel value is either 0 or 1: in binary format, 0 stands for black and 1 stands for white.

If we remove the color and write the binary value in each pixel, it will look like this.
Now can you guess the shape of this matrix? It is the same as the number of pixels across the height and width of the image. In this case, the shape of the matrix is 7 x 7.

So every image on a computer is saved in this form, as a matrix of numbers, and this matrix is also known as a channel.

A channel (matrix) in image processing.
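To make the idea of a channel concrete, here is a minimal sketch of a 7 x 7 black and white image held as a matrix. It assumes NumPy is available; the pixel pattern (a white square outline) and the name `binary_image` are made up for illustration and are not the image from the figure.

```python
import numpy as np

# Hypothetical 7x7 black and white image: 0 = black, 1 = white.
# The pattern (a white square outline) is invented for this example.
binary_image = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
], dtype=np.uint8)

# The shape is (rows, columns), i.e. pixels across the height and width.
height, width = binary_image.shape
print(binary_image.shape)  # (7, 7)
```

Each entry of the matrix is one pixel value, so this single 2-D array is one channel.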

What is grayscale image representation?

Grayscale images are monochrome images, meaning they carry only intensity and no information about color. In grayscale representation, instead of only black and white, we can have many different shades of grey. Hence the name grayscale!

A typical grayscale image stores 8 bits per pixel, which gives 2^8 = 256 different grey levels. In medical imaging and astronomy, 12 or 16 bits/pixel images are used.
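The link between bits per pixel and the number of grey levels is simple arithmetic: n bits can encode 2^n distinct values. A small sketch, where `gray_levels` is a hypothetical helper name used only for this example:

```python
def gray_levels(bits_per_pixel: int) -> int:
    """Number of distinct intensity values that fit in the given bit depth."""
    return 2 ** bits_per_pixel

# 1 bit is the black and white case; 8, 12 and 16 are the depths mentioned above.
for bits in (1, 8, 12, 16):
    print(bits, "bits/pixel ->", gray_levels(bits), "levels")
```

For 8 bits this gives 256 levels, which is why pixel values run from 0 to 255.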

The pixel at (1,1) is pure white and the pixel at (7,7) is pure black; the others are different shades of grey.

For an 8-bit grayscale image, pixel values range from 0 to 255. Smaller numbers, closer to 0, represent darker shades, while larger numbers, closer to 255, represent lighter shades, with 0 being black and 255 being white.

This is how a grayscale image is stored, with values in the range 0 to 255.
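As a rough sketch of such a matrix, the snippet below builds a 7 x 7 grayscale image whose top-left pixel is pure white and whose bottom-right pixel is pure black. It assumes NumPy and an 8-bit unsigned integer type; the diagonal gradient in between is an invented pattern, not the exact values from the figure.

```python
import numpy as np

# Row and column index of every pixel in a 7x7 grid (0-based).
rows, cols = np.indices((7, 7))

# Hypothetical diagonal gradient: brightness falls as row + column grows.
# rows + cols ranges from 0 (top-left) to 12 (bottom-right).
gray_image = np.round(255 * (1 - (rows + cols) / 12)).astype(np.uint8)

print(gray_image.shape)  # (7, 7)
print(gray_image.dtype)  # uint8, i.e. 8 bits per pixel
print(gray_image[0, 0])  # 255 (pure white)
print(gray_image[6, 6])  # 0 (pure black)
```

Note that NumPy indices start at 0, so the article's pixel (1,1) is `gray_image[0, 0]` and pixel (7,7) is `gray_image[6, 6]`.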

How are colored images stored on a computer?

A colored image is composed of multiple colors, and almost all colors can be generated from three primary colors: red, green and blue. So a colored image is a stack of 3 color channels, in the order R, G, B.

The illustration below shows this clearly.

This picture is a 7x7 colored image. Modern digital color images follow the same principle of using 3 color channels, since almost all colors can be made from a mixture of these 3 primary colors.

Here the yellow pixel at (2,2) has the RGB pixel value (255,255,0), which is stored in a 3D matrix.

A colored image is a 3D matrix: each color has 1 channel, and all channels are superimposed to form one image.

Each color channel has its own pixel values, and the superimposed pixel values create the image shown.

The following images show what a colored image matrix looks like.

This is the superimposed, or final, 3D matrix of the color image, where each pixel value ranges from 0 to 255 and we have 3 such channels.
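The stacking of channels can be sketched as follows, assuming NumPy. Three separate 7 x 7 matrices are created and superimposed into one 3D array. Only the yellow pixel at (2,2) is taken from the text; every other pixel is left black for simplicity, which is an assumption of this example.

```python
import numpy as np

# One 7x7 matrix per color channel, all starting at 0 (black).
red = np.zeros((7, 7), dtype=np.uint8)
green = np.zeros((7, 7), dtype=np.uint8)
blue = np.zeros((7, 7), dtype=np.uint8)

# Yellow is full red plus full green with no blue: (255, 255, 0).
# The article's pixel (2,2) is index [1, 1] with 0-based indexing.
red[1, 1] = 255
green[1, 1] = 255

# Superimpose the channels along a new last axis, in R, G, B order.
color_image = np.stack([red, green, blue], axis=-1)

print(color_image.shape)            # (7, 7, 3)
print(color_image[1, 1].tolist())   # [255, 255, 0]
```

Indexing with a row and a column returns the three channel values of that pixel together.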

So let’s see an example of a colored image. This is an image of a dog:

This image is composed of many colors, and almost all colors can be generated from the three primary colors: red, green, and blue. We can say that each colored image is composed of these three colors, or 3 channels: red, green, and blue.

This means that a colored image has more matrices, or channels, than a grayscale image. In this particular example, we have 3 matrices: 1 matrix for red, known as the Red channel,

another matrix for green, known as the Green channel,

and finally a matrix for blue, known as the Blue channel.

Each of these matrices again has values ranging from 0 to 255, where each number represents the intensity of the pixel, or in other words the shade of red, green, or blue. Finally, all of these channels are superimposed, so the shape of the image, when loaded on a computer, will be

N x M x 3

where N is the number of pixels across the height, M is the number of pixels across the width, and 3 is the number of channels; in this case, the 3 channels are R, G, and B. In our example, the shape of the colored image is 6 x 5 x 3, since we have 6 pixels across the height, 5 across the width, and 3 channels.
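A quick way to check this shape is to build an array with the same dimensions and take it apart again. The sketch below assumes NumPy and uses an all-zero placeholder image, since the actual pixel values of the example are not given in the text.

```python
import numpy as np

# Placeholder 6 x 5 color image (all black); only the shape matters here.
image = np.zeros((6, 5, 3), dtype=np.uint8)

# N = pixels across the height, M = pixels across the width.
n, m, channels = image.shape
print(n, m, channels)  # 6 5 3

# Slicing the last axis recovers the individual 2-D channels.
red_channel = image[:, :, 0]
green_channel = image[:, :, 1]
blue_channel = image[:, :, 2]
print(red_channel.shape)  # (6, 5)

# Total number of stored values: 6 * 5 * 3.
print(image.size)  # 90
```

Each channel on its own has the N x M shape of a grayscale image.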

Feature extraction of images

Handling the third dimension of an image can sometimes be complex and redundant. For feature extraction, it becomes much simpler if we compress the image to a 2-D matrix. This is done by gray-scaling or binarizing. Gray-scaling retains more information than binarizing, as it represents the image as a combination of different intensities of gray, whereas binarizing simply builds a matrix of 0s and 1s.

So when doing a CV task in machine learning, you can simplify feature extraction by compressing images, i.e. converting them into grayscale or binary format.
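Both compressions can be sketched in a few lines, assuming NumPy. The article does not say how the conversion is done, so two choices here are assumptions: the grayscale value is a weighted sum of R, G and B using the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114), and binarizing uses a fixed threshold of 128. The function names `to_grayscale` and `binarize` are hypothetical.

```python
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Collapse an N x M x 3 RGB image to an N x M grayscale matrix.

    Assumption: BT.601 luma weights; other weightings are also in use.
    """
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights  # weighted sum over the channel axis
    return np.round(gray).astype(np.uint8)

def binarize(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Map each pixel to 1 (white) if it is at or above the threshold, else 0."""
    return (gray >= threshold).astype(np.uint8)

# Tiny 6 x 5 test image: black everywhere except two pixels.
rgb = np.zeros((6, 5, 3), dtype=np.uint8)
rgb[0, 0] = (255, 255, 255)  # white
rgb[0, 1] = (255, 255, 0)    # yellow

gray = to_grayscale(rgb)
binary = binarize(gray)

print(gray.shape)    # (6, 5): the third dimension is gone
print(binary.shape)  # (6, 5)
```

The grayscale matrix still distinguishes white from yellow by intensity, while the binary matrix maps both to 1, which illustrates why gray-scaling keeps more information than binarizing.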