Image compression using singular value decomposition

Unsupervised Blog
Balabit Unsupervised
4 min readApr 11, 2016

Welcome back!

Last time we have introduced the SVD-decomposition and discussed the rank-k approximation of a matrix A. We have also seen a small example how it works, but we probably failed to convince the Reader why such suspicious-looking matrix manipulation would be useful outside of mathematics. To ease the doubt about the applicability of this mathematical tool, we are going to show an interesting application of the SVD-decomposition, namely image compression.

Image compression using SVD

Images are represented in a rectangular array where each element corresponds to the grayscale value for that pixel. For coloured images we have a 3-dimensional array of size n×m×3, where n and m represent the number of pixels vertically and horizontally, respectively, and for each pixel we store the intensity for colours red, green and blue.

What we are going to do is to repeat the low-rank approximation procedure that we introduced last week on a larger matrix, that is, we create the low-rank approximation of a matrix that represents an image for each colour separately. The resulting 3-dimensional array will be a good approximation of the original image, as we will see soon. The Ipython notebook with the following code snippets can be downloaded from this location.

For demonstration purposes we have chosen an image of the Castle hill in Budapest. The original image we would like to compress can be found below.

Castle hill, Budapest

The image can be imported into the notebook by using the PIL package. We have also normalized the intensity values in each pixel, that’s why we have divided the values by 255.

The next step is to separate the grayscale values for each color, that is, we work with 3 independent 2-dimensional matrices, and perform an SVD-decomposition for each colour separately. We have picked k=50, that is, we construct the best rank-50 approximations of the intensity matrices for each colour.

We can calculate how many bytes are needed to store the pieces of the rank-50 approximation compared to the size of the original image. Actually, we have achieved a compression rate 8% which means that only 8% of the original storage space is used for those matrices from which we can restore a matrix that is close to the original one (the closest among all matrices that have rank 50).

Well, the restored matrix can be close to the original image, but are we satisfied with the result? Good compression rate usually means quite bad quality, so let’s check what we have achieved.

Castle hill, restored image using rank-50 approximation

The buildings are recognisable, still, the approximate image is quite distorted and noisy. We have also repeated the experiments using k=10 and k=200. The resulting approximations can be found below. For the sake of completeness, the storage of the matrices for k=10 requires 62,080 bytes resulting in an impressive 1.6% compression rate (albeit the recovered image is nearly unrecognizable) whilst for k=200 the storage space is 12,561,600 bytes and the compression rate is 32.2%.

You may also would like try to compress several images using different values for k. You can do this by downloading the Ipython notebook and run the code on the images of your choice.

Castle hill, restored image using rank-10 approximation
Castle hill, restored image using rank-200 approximation

To sum up, the singular value decomposition is a very powerful mathematical tool with applications in many fields, like image compression that we have just covered briefly, but there are many others. SVD-decomposition is generally a good choice when one has to compress large dataset (that is, reduce their dimensions) in such a way that the inner structure and correspondence relations between the data points are preserved in some way. Principal component analysis is a good example, we might devote a post for it sometime.

Originally published at www.balabit.com on April 11, 2016 by Tamás Kurics.

--

--