Non-Negative Matrix Factorization for Image Compression and Clustering
From scratch implementation of a versatile algorithm in Python
You may ask yourself: what does image compression have to do with clustering? Honestly, at first glance, reducing the file size of an image and grouping data points into distinct clusters do not seem to have much in common. However, if you think about it for a moment, you may realize that both tasks share some common features.
To get started, let’s look at image compression first. Image compression essentially deals with the question: how can we reduce the size of an image file without compromising (too much) on image quality? A digital image is basically just a bunch of numbers in a matrix, with the dimensions of the matrix determining the resolution of the image. So instead of treating an image as a visual percept with all its content and meaning, let’s think of it as a numerical matrix.
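To make the "image as a matrix" idea concrete, here is a minimal sketch using NumPy. The tiny 4 x 6 grayscale array is a made-up example (not an image from this article), and it assumes one intensity value per pixel between 0 and 255.

```python
import numpy as np

# Hypothetical 4 x 6 grayscale "image": 4 rows (height), 6 columns (width).
# Each entry is a pixel intensity from 0 (black) to 255 (white).
image = np.array(
    [
        [0, 0, 255, 255, 0, 0],
        [0, 255, 255, 255, 255, 0],
        [0, 255, 255, 255, 255, 0],
        [0, 0, 255, 255, 0, 0],
    ],
    dtype=np.uint8,
)

# The matrix dimensions are the image resolution.
height, width = image.shape
print(f"resolution: {width} x {height} pixels")
print(f"numbers stored: {image.size}")
```

The number of stored values grows with the product of height and width, which is why reducing how many numbers we need to keep is the heart of compression.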
As you may know, two matrices can be multiplied only if the number of columns in the first matrix matches the number of rows in the second; apart from that, their dimensions may differ. The resulting product has as many rows as the first matrix and as many columns as the second: multiplying an m × k matrix by a k × n matrix yields an m × n matrix.
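The shape rule for matrix products can be checked quickly in NumPy. This is only an illustrative sketch: the matrices `A` and `B` and their sizes are arbitrary choices made for this example, filled with random values.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

A = rng.random((4, 3))  # 4 rows, 3 columns
B = rng.random((3, 5))  # 3 rows, 5 columns

# Columns of A (3) match rows of B (3), so the product is defined.
C = A @ B
print(C.shape)  # rows of A, columns of B

# Swapping the order breaks the rule: B has 5 columns, A has 4 rows.
mismatch_detected = False
try:
    B @ A
except ValueError as err:
    mismatch_detected = True
    print("incompatible shapes:", err)
```

Notice that the shared inner dimension (3 here) does not appear in the shape of the result; only the outer dimensions do.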