Compressing images using Linear Algebra

Gaurav Sharma
Published in Analytics Vidhya
10 min read · Sep 23, 2020

We have all studied some degree of linear algebra in high school, and some of us even studied it rigorously during graduation. But linear algebra is more powerful than most of us realize. It comprises algorithms and methods that are enormously useful in the real world, especially in the study and manipulation of images. Images are one of the most widely used mediums of communication in today’s digital and social world.

We all share thousands of images, memes, etc. with the people we know. But when it comes to sharing large image files, we need to compress them so that the file size shrinks significantly while the image quality and content stay as good as possible. Images are represented as 3-dimensional arrays/matrices of pixels (2 dimensions for height and width, 1 for channels), and wherever a matrix appears, linear algebra follows.

This story is divided into the following parts -

  • Linear algebra refresher
  • Visualize what a matrix does to a vector
  • Visualize what a matrix does to a shape
  • Matrix factorization and SVD
  • Compressing a small dummy image using SVD
  • Compressing a big real image using SVD

Note: This story contains lots of images (20+), so if you are reading it over a slow internet connection or on a mobile device, you may face rendering issues and miss some of the images. On mobile devices the images are also sometimes stretched automatically by Medium. So, it’s recommended to read this on a laptop/desktop.

Linear Algebra refresher

The two core elements of linear algebra are vectors and matrices. A vector represents a point in Euclidean space, whereas a matrix is a linear mapping that maps vectors from one space to another (the two spaces may have the same or different dimensions). I used the term linear mapping here: it means the mapping from one vector space to another respects the underlying (linear) structure of each vector space, i.e., it preserves linearity (a vector space is just an abstract representation). Mathematically, we write this as

L(αx + βy) = αL(x) + βL(y)

Here, L is the linear map, also known as a linear transformation, and α, β are arbitrary scalars.

That is all mathematical jargon, but what does a matrix really do geometrically? Given a vector in Euclidean space, it will either scale it, rotate it, or do both sequentially. Let’s visualize what a matrix (linear mapping) does to a vector in Euclidean space (R² in this case).

Visualize what a matrix does to a vector

We will first create a random vector and random linear map.
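A minimal sketch with numpy (the seed is just for reproducibility):

```python
import numpy as np

np.random.seed(42)        # for reproducibility
u = np.random.rand(2)     # a random vector in R^2
M = np.random.rand(2, 2)  # a random 2x2 matrix, i.e., a linear map on R^2
```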

Now, we will transform the random vector u using the linear map M; this gives us a transformed vector, which we call v.
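Applying the map is a plain matrix-vector product:

```python
v = M @ u   # transform u with the linear map M
print(u, v)
```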

So, we see that the linear map M does both scaling and rotation when applied to u.

If M had only scaled u, then u would be called an eigenvector of the linear map M, and the factor by which M scales u would be the eigenvalue corresponding to that eigenvector. Likewise, if M had only rotated u, then M would be called a rotation matrix or a rotational mapping.

So, a linear map/matrix either scales, rotates, or does both to a vector in Euclidean space.

Visualize what a matrix does to a shape

We now visualize what a matrix/linear map does to a shape in Euclidean space (R² in this case).

We will take a unit circle centered at (2, 2), apply the linear map M to it, and visualize how it gets transformed.

circum_vectors are the vector points on the circumference of this unit circle. Now, we will transform these circum_vectors using M.
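A sketch of this step, keeping the circum_vectors name used here (sampling 100 points on the circumference is an arbitrary choice):

```python
theta = np.linspace(0, 2 * np.pi, 100)
# circumference points of a unit circle centered at (2, 2),
# stored as the columns of a 2x100 matrix
circum_vectors = np.vstack([2 + np.cos(theta), 2 + np.sin(theta)])

# transform every circumference point with the linear map M
transformed_vectors = M @ circum_vectors
```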

Now, let’s visualize how this mapping changes the shape of the circle.

So, we can see that the circle is transformed into an ellipse. This is because each vector point is scaled and rotated, which results in the transformed ellipse. Every point on the circle maps one-to-one to a point on the ellipse, since the mapping is linear.

Matrix factorization and SVD

Matrix factorization is one of the most powerful tools in linear algebra; it is used extensively not just in mathematics but also in data science and machine learning. Recommender systems use matrix factorization heavily. The idea is to factorize a matrix into pieces whose properties are known to us or which are easy to manipulate. Many matrix factorizations exist, but the most popular and fundamental is the Singular Value Decomposition (SVD). Any matrix, no matter its shape, can be decomposed into three matrices using SVD.

We can decompose any linear transformation M into 3 transformations. These are -

  • first rotating using V_T (right singular vectors)
  • then scaling using S (singular values)
  • then again rotating using U (left singular vectors)

Here V_T, S, U are the factors of M, i.e., M = U · S · V_T. So, any linear map can be decomposed into these three fundamental transformations, and this decomposition is called the Singular Value Decomposition (SVD).

[Figure: geometric illustration of the SVD factorization, M = U · S · V_T]

Let’s apply SVD to the linear map M we used earlier.
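With numpy this is a single call; note that np.linalg.svd returns the singular values as a 1-D array rather than a matrix:

```python
U, S, V_T = np.linalg.svd(M)
# U and V_T are orthogonal matrices (rotations, possibly with a reflection),
# S is the 1-D array of singular values (the scaling factors)
```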

Transforming using SVD

Now, we will transform the above unit circle centered at (2, 2) using these decomposed transformations sequentially, and we will see that the end result is indeed the same as before.

Whether we apply M directly, or first apply V_T, then S, and then U, we always get the same result, as the image below shows.
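We can check this numerically:

```python
direct = M @ circum_vectors
# rotate with V_T, scale with diag(S), rotate with U, in that order
sequential = U @ np.diag(S) @ (V_T @ circum_vectors)
print(np.allclose(direct, sequential))   # True
```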

Now, let’s plot these transformations individually.

So, V_T rotated the original circle, then S scaled the rotated circle, and finally U rotated the scaled circle again. All of this is done sequentially.

So, we can see that the end result is indeed exactly the same. Note that these images are on different scales (see the x ticks); if we plotted all of them on the same scale, the circle would look very tiny.

Image compression using SVD

Now we will use this extremely powerful tool to compress images. We will see how the outer products of the singular vectors u and v corresponding to the few largest singular values can approximate the original image really well while reducing the image size significantly.

Compressing dummy image

In this section we will take a very small 7x7 dummy image. We first apply SVD to it, then look at the rank 1 images corresponding to all the singular values. Then we construct the best rank i image by accumulating the rank 1 images corresponding to singular values 1 through i.

Below is the code for a random 7x7 black-and-white image.
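A plausible version of that code, assuming the image is drawn with numpy’s randint (whether this exact call reproduces the rank 6 image shown in the story is an assumption):

```python
np.random.seed(42)
# a random 7x7 black-and-white (0/1) image
img = np.random.randint(0, 2, size=(7, 7))
print(np.linalg.matrix_rank(img))   # the story reports rank 6
```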

Note, if you want to construct this same image, pass the same seed, 42. Also, this image has rank 6.

The rank of a matrix is the number of linearly independent rows/columns in the matrix. It can never exceed the smaller of the number of rows and the number of columns.

Let’s visualize it.

Now, we will decompose this image using SVD.
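Again a single numpy call:

```python
U, S, V_T = np.linalg.svd(img)
```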

Below are all the rank 1 images. They are constructed by taking the outer products of the columns of U with the rows of V_T and scaling by the corresponding singular value in S.
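In code, the rank 1 pieces look like this:

```python
# each rank 1 image is s_i times the outer product of the i-th column
# of U with the i-th row of V_T
rank1_images = [S[i] * np.outer(U[:, i], V_T[i, :]) for i in range(len(S))]
```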

We will use these rank 1 images to construct images close to the original. We do this by taking the cumulative sum of the rank 1 images. Doing so increases the rank of the matrix and hence the information stored, i.e., the constructed image resembles the original more closely as we accumulate more and more rank 1 images.

Note: At every step, the approximated image is the best possible approximation of the original image at that rank.

Best rank 1 approximation

A_1 = s1 · u1 · v1^T

The best rank 1 image is nothing but the outer product of u1 with v1, scaled by s1.

Best rank 2 approximation

A_2 = A_1 + s2 · u2 · v2^T

The best rank 2 image is the sum of the best rank 1 image above and the outer product of u2 with v2, scaled by s2.

Best higher rank approximation

Similarly, the best rank i image is the sum of the best rank (i-1) image and the outer product of ui with vi, scaled by si.

We will construct the best rank i image by accumulating all the rank 1 images corresponding to singular values 1 through i.

A_i = s1 · u1 · v1^T + s2 · u2 · v2^T + … + si · ui · vi^T
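A sketch of the cumulative construction, reusing the rank1_images list built above:

```python
def best_rank(i):
    # best rank-i approximation: sum of the first i rank 1 images
    return sum(rank1_images[:i])

approximations = [best_rank(i) for i in range(1, len(S) + 1)]
```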

We can see that as we add more and more rank 1 images, the approximation gets closer and closer to the original image, until finally we recover the original exactly.

If the rank of the image is r, then adding all r rank 1 images gives us back the original image.

But we won’t go all the way to rank r; that would give us no benefit and would even require more space. So we stop earlier. In this case most of us would stop at the best rank 4 or 5 approximation, because it is close enough to the original that we get a good idea of what the image is. And certainly nobody would go beyond the best rank 6 approximation, because the rank 7 image is identical to the rank 6 one: since the image has rank 6, the last singular value is 0 and adds nothing to the image.

Let’s see how much space we save by using the best rank 3 image.
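A rank k approximation of an m x n image needs only the first k columns of U, the first k singular values, and the first k rows of V_T, i.e., k(m + n + 1) numbers instead of m·n:

```python
m, n, k = 7, 7, 3
original_storage = m * n                # 49 numbers
compressed_storage = k * (m + n + 1)    # 3 * (7 + 7 + 1) = 45 numbers
saving = 1 - compressed_storage / original_storage
print(f"space saved: {saving:.1%}")     # ~8.2%
```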

So, even for an extremely tiny 7x7 image, we saved around 8% of the space with a rank 3 approximation. This saving is not huge, and we are also compromising on image quality, but for bigger real-life images we can save a lot of space while keeping the image quality almost the same.

Compressing real image

For simplicity I am using a grayscale image. But the method extends easily to color images as well: just apply it to each channel.

This image has a very high resolution, so I am resizing it to something smaller.

The rank of the resized image is 300 (which is a lot). Let’s decompose it.
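A sketch of the loading, resizing, and decomposition; the file name and the 400x300 target size are my assumptions (a 300x400 pixel array is consistent with the reported rank of 300):

```python
from PIL import Image

img = Image.open("photo.jpg").convert("L")              # hypothetical file; "L" = grayscale
img = np.asarray(img.resize((400, 300)), dtype=float)   # 300 rows x 400 columns

print(np.linalg.matrix_rank(img))   # 300, i.e., full rank since rank <= min(300, 400)
U, S, V_T = np.linalg.svd(img)
```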

Below are some rank 1 images corresponding to the top singular values.

Clearly we cannot guess anything from these rank 1 images, but the magic happens when we start accumulating them.

We will need a few helper functions.

The SVD in numpy’s linalg package returns S not as a matrix but as a 1-D array of values, so I defined a function that returns S as a diagonal matrix.

perc_storage: Returns the percentage of storage taken by the compressed image.

perc_energy: Returns the percentage of energy (sum of sigmas) retained by the compressed image.

We need to decide the rank at which to compress the image, but this can’t be chosen manually because it varies from image to image. So we will use the functions below, which return the optimal rank for a given energy or storage budget.

get_optimal_rank_by_energy: Gives the lowest rank that retains the given percentage of energy.

get_optimal_rank_by_storage: Gives the highest rank that fits within the given percentage of storage.
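Plausible implementations of these helpers (the names match the descriptions above; the exact bodies are my reconstruction):

```python
def get_S_matrix(s, shape):
    # numpy's svd returns the singular values as a 1-D array;
    # place them on the diagonal of a matrix with the image's shape
    S_mat = np.zeros(shape)
    np.fill_diagonal(S_mat, s)
    return S_mat

def perc_storage(k, m, n):
    # percentage of the original m*n storage used by a rank-k approximation
    return 100 * k * (m + n + 1) / (m * n)

def perc_energy(s, k):
    # percentage of the total energy (sum of sigmas) retained at rank k
    return 100 * s[:k].sum() / s.sum()

def get_optimal_rank_by_energy(s, energy):
    # lowest rank whose retained energy reaches the given percentage
    cum = 100 * np.cumsum(s) / s.sum()
    return int(np.searchsorted(cum, energy)) + 1

def get_optimal_rank_by_storage(m, n, storage):
    # highest rank whose storage stays within the given percentage budget
    return int(storage * m * n / (100 * (m + n + 1)))
```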

Now let’s find the optimal ranks for this image using the functions defined above.
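For example, with the assumed 300x400 image shape:

```python
rank_by_energy = get_optimal_rank_by_energy(S, 85)           # lowest rank with 85% energy
rank_by_storage = get_optimal_rank_by_storage(300, 400, 30)  # highest rank within 30% storage
print(rank_by_energy, rank_by_storage)
```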

So we get the optimal ranks for 85% energy and for 30% storage.

Now let’s visualize compressed images of different ranks and compare them with the original image.
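A sketch of the reconstruction and comparison (the displayed ranks are the ones discussed below):

```python
import matplotlib.pyplot as plt

def compress(U, s, V_T, k):
    # best rank-k reconstruction: keep only the first k singular triplets
    return U[:, :k] @ np.diag(s[:k]) @ V_T[:k, :]

fig, axes = plt.subplots(1, 4, figsize=(16, 4))
for ax, k in zip(axes[:3], (50, 75, 100)):
    ax.imshow(compress(U, S, V_T, k), cmap="gray")
    ax.set_title(f"rank {k}, storage {perc_storage(k, 300, 400):.0f}%")
axes[3].imshow(img, cmap="gray")
axes[3].set_title("original")
plt.show()
```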

We can see that rank 50 is a very good approximation: it retains 75% of the information content while keeping the storage requirement below 30%. So we get 75% of the information while saving about 71% of the space, a big saving that still keeps the image content rich.

Many of us would stop at the rank 50 approximation, some may go up to rank 75, and those who want a clearer image may go up to rank 100. There is always a trade-off between the information retained and the space required, and it depends on your needs: if you want a higher-quality image, go to higher rank approximations at the cost of more storage; if space is limited, stop at a lower rank, retaining the maximum information within the given space limits.

Below is a plot showing how the rank trades off against these different attributes.

You can get the entire documented Jupyter notebook for this story from here; you just need to fork it. If you like the notebook, do upvote it; that motivates me to create more quality content.

If you like this story, do clap and share it with others.

Also, have a read of my other stories, which cover a variety of topics.

Thank you once again for reading my stories, my friends :)
