Teaching a computer what pictures are

Raffaello Ippolito
4 min read · Jul 31, 2023


Images have always been fundamental to humankind: since the earliest times, people have used them to remember, illustrate, and communicate. For this reason, making the best use of them is still a matter of great interest today. There are many digital processing methods, commonly known as “filters”, whose purpose is to modify an image by extracting elements from it, hiding them, or making them stand out, so that the image becomes as useful as possible for the task at hand.

What is an image?

To process an image, we must first find a way to make it manageable by a computer, so let’s see how we can encode one.

What is the most precise definition of an image you could give? It is not something we are used to explaining; everyone knows what an image is! Well, computers don’t; they are machines, after all. If I had to give a definition, I would say: “An image is a two-dimensional object of finite size defined by a succession of colors.” To the human eye, this succession of colors is continuous. A computer, however, can only store a finite amount of data, so it struggles with the continuum. The solution is to discretize the problem: divide the image into pixels and assign a color to each pixel.
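The discretization step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names `continuous_brightness` and `sample_grid` are hypothetical, and the "continuous image" is assumed to be a simple left-to-right brightness gradient on the unit square, sampled at the center of each pixel.

```python
def continuous_brightness(x, y):
    """Hypothetical continuous image: brightness grows from left (0.0) to right (1.0)."""
    return x


def sample_grid(image_fn, width, height):
    """Sample a continuous image at the center of each pixel of a width x height grid."""
    grid = []
    for row in range(height):
        y = (row + 0.5) / height  # vertical center of this pixel row
        grid.append([image_fn((col + 0.5) / width, y) for col in range(width)])
    return grid


pixels = sample_grid(continuous_brightness, width=4, height=2)
for row in pixels:
    print(row)
```

Increasing `width` and `height` gives a finer grid and a closer approximation of the continuous picture, at the cost of more values to store.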

What about colors?

Okay, we said that we can divide an image into pixels, thus creating a grid; each pixel can then be identified by a coordinate pair on this grid. At this point we associate each coordinate pair (i.e., each pixel) with a color and we are done! Right? Hmm, but what is a color? We have to define colors as well.

Let’s start with the simplest images we can find: those in black and white (which is different from grayscale!). In such an image each pixel is either black or white, so all we need is a boolean variable, say 0 for black and 1 for white. And now we have an image!
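As a small sketch of this idea, a black and white image can be held as a grid of 0s and 1s, with each pixel looked up by its coordinate pair. The helper names `pixel_at` and `render` are hypothetical, and counting coordinates from the top-left corner is an assumption made here for convenience.

```python
# 1 = white, 0 = black, as in the text. Rows are listed top to bottom.
image = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
]


def pixel_at(img, x, y):
    """Return the color of the pixel at column x, row y (both counted from 0)."""
    return img[y][x]


def render(img):
    """Draw the image as text: '#' for a black pixel, '.' for a white one."""
    return "\n".join("".join("#" if p == 0 else "." for p in row) for row in img)


print(pixel_at(image, 2, 0))
print(render(image))
```

Printing the rendered text shows a small black diamond on a white background.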

Black and white images were fine in 1942, when the first digital images were born, but such a low level of detail is of limited use. Perhaps a gray would make the image clearer. OK then, let’s use 0 for black, 1 for gray, and 2 for white. Much better. Shall we add one more? 0 for black, 1 for dark gray, 2 for light gray, and 3 for white. While we are at it, we might as well add another 252 shades of gray: start with 0 for black, order the shades of gray by increasing brightness and number them from 1 to 254, and end with 255 for white. That makes 256 levels in total, exactly as many as fit in one byte. Here is a grayscale image!
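The progression from 2 to 3 to 4 to 256 levels can be sketched as a quantization step. This is an illustrative sketch, not a standard routine: `quantize` is a hypothetical helper, and it assumes the original brightness is given as a number between 0.0 (black) and 1.0 (white), split into equally wide bands.

```python
def quantize(brightness, levels):
    """Map a brightness in [0.0, 1.0] to an integer level in 0 .. levels - 1."""
    if not 0.0 <= brightness <= 1.0:
        raise ValueError("brightness must be between 0.0 and 1.0")
    # Split [0, 1] into equal bands; clamp so that 1.0 lands on the top level.
    return min(int(brightness * levels), levels - 1)


# The same four brightness values, encoded with more and more shades.
for levels in (2, 3, 4, 256):
    print(levels, [quantize(b, levels) for b in (0.0, 0.3, 0.6, 1.0)])
```

With 2 levels every pixel collapses to black or white; with 256 levels, nearby brightness values keep distinct codes.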

We can interpret this number as the brightness level of the pixel or, if you like, as the “amount of white” in the pixel, i.e., as the answer to the question, “On a scale of 0 to 255, how white is this pixel?” Someone then thought to ask, “On a scale of 0 to 255, how red is this pixel? How green is it? How blue is it?” Combining different levels of these three colors can produce practically any color a screen can display, so we stop using a single number and use a triplet of numbers instead. This gives us a color image in RGB encoding.
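A short sketch of the step from one number to a triplet: a gray is just a color with equal amounts of red, green, and blue. The function names are hypothetical, and collapsing a color back to a single brightness by plain averaging is an assumption chosen for simplicity (other weightings of the three channels are also in common use).

```python
def gray_to_rgb(level):
    """A gray of brightness `level` has the same amount of red, green and blue."""
    return (level, level, level)


def rgb_to_gray(rgb):
    """Collapse an RGB triplet to one brightness value by plain averaging (an assumption)."""
    r, g, b = rgb
    return (r + g + b) // 3


# A 2 x 2 color image: each pixel is an (R, G, B) triplet with values in 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red,  green
    [(0, 0, 255), (255, 255, 255)],  # blue, white
]

print(gray_to_rgb(128))
print([[rgb_to_gray(p) for p in row] for row in image])
```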

Conclusions

So what is an image for a computer? An image is an ordered set of pixels, where each pixel is identified by a coordinate pair and associated with a color, which is in turn identified by a triplet of integers. Depending on our needs, we can extend this encoding by adding channels, such as transparency; the color is then identified by a set of 4 numbers. The encoding shown in this article is clearly not the only one possible: there are, for example, color images with CMYK encoding, or vector images, which are based on a totally different concept.
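To make the fourth channel concrete, here is a hedged sketch of how a transparency value might be used when a pixel is drawn on top of a background. The function `composite_over` is hypothetical, and it assumes the common convention that 255 means fully opaque and 0 means fully transparent.

```python
def composite_over(pixel_rgba, background_rgb):
    """Blend an (R, G, B, A) pixel onto an opaque background.

    A = 255 is taken to mean fully opaque and A = 0 fully transparent.
    """
    r, g, b, a = pixel_rgba
    return tuple(
        round((fg * a + bg * (255 - a)) / 255)
        for fg, bg in zip((r, g, b), background_rgb)
    )


white = (255, 255, 255)
print(composite_over((255, 0, 0, 255), white))  # opaque red
print(composite_over((255, 0, 0, 0), white))    # fully transparent red
print(composite_over((255, 0, 0, 128), white))  # roughly half-transparent red
```

The opaque pixel hides the background, the transparent one shows only the background, and the intermediate value gives a mix of the two.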

I have written this article trying to make it accessible to anyone. I must confess, however, that as a mathematician the definition I would have given of an image is simply I : R² → R³, meaning a function that, given a pair of numbers (the coordinates), returns a triplet of numbers (identifying the color). The idea of an image as a function is of great importance in the theory of digital image processing, but we will discuss this in future articles.
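For readers who like notation, the pixel-based encoding described in this article can be read as a discretized version of that function. The following is one possible way to write it; the symbols W and H (the image width and height in pixels) and the name I_d are introduced here purely for illustration.

```latex
I : \mathbb{R}^2 \to \mathbb{R}^3
\qquad\longrightarrow\qquad
I_d : \{0, \dots, W-1\} \times \{0, \dots, H-1\} \to \{0, \dots, 255\}^3
```

For example, I_d(x, y) = (255, 0, 0) would say that the pixel at coordinates (x, y) is pure red.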


Raffaello Ippolito

Italian software developer and data analytics student. Graduated in Mathematics for Engineering talking about Big Data and Image Processing