un·rav·el (ŭn-răv’əl) v. un·rav·eled un·rav·el·ing un·rav·els ~ To clarify the elements of something mysterious or baffling
I has been a while since my first post, for which I chose to write about the relatively unfamiliar Floyd-Steinberg dithering algorithm. Most probably my post would have been read by more people had it been called How I used deep learning to become a successful investor or so… To receive at least some readers, I decided to make the dithering as practically oriented and accessible as possible, which seems to have worked! Viewer stats were far better than predicted, so may be my next post should be something Bayesian. Nevertheless, in my sincere enthusiasm to show you the elegance of the algorithm of Robert W. Floyd and Louis Steinberg, I did skip over some very interesting statistical details. Such a shame! In this article I truly hope to compensate for this deficiency by explaining something about permutations and how we can employ them to pick the optimal color palette. Let’s unravel!
Image dithering: a quick recap
To recapitulate, here is a super small summary of my previous post: we exploited the Floyd-Steinberg algorithm to create an attractive representation of Quokka Quinten, despite using a 3-bit color palette only. Graphically, it turned out to look like this:
It is without doubt that the dithering algorithm provides a way to immensly improve on the plain 3-bit representation. Especially when looking superficially, the result is quite dazzling! However, taking a closer look reveals considerable artefacts and we can even see some green and red dots. A possible solution is to expand our color palette to for example 6-bit, which means being able to choose from 2⁶ = 64 colors. It should be no surprise that we substantially ameliorate the image, since we have eight times as much colors to our disposal. Since a picture says a thousands words, here is the story depicted graphically:
It goes without saying that using increasingly extensive color palettes would yield increasingly pretty images as well, but it won’t surprise you that it comes at certain costs (e.g., computation times, data storage). Well, sounds like we have come across an interesting trade-off! One possible strategy is to just eyeball some images that are dithered using different-sized color palettes, look at their storage size and decide on the optimal number of colors to use. There certainly is a better — and most important: statistically more sound — way to tackle this problem. For this we exploit a basic understanding of three concepts: loss function, permutation and an objective criterion for ‘costs’ — the Gap statistic.
Composing a loss function
The first thing that pops up in mind when we talk about any optimization problem, is the notion of a loss function. One of the most well-known loss functions in machine learning is the Mean Squared Error (MSE) for regression problems and Cross Entropy for classification tasks. For the Floyd-Steinberg algorithm we need can’t use those existing functions, so we have to be a little bit more creative. Brace yourselves…
Let R be the set of row indicators of an image (except the first row) and C be the set of column indicators (except the first and last column). The diffused error eᵢⱼ for pixel xᵢⱼ can now be defined as
with 1(ij) indicator functions for either the rows (R) or columns (C) and cₖ a particular color from the K color palette. Now, with the diffused error eᵢⱼ made explicit, we define ĉₖᵢⱼ as the corresponding color to pixel xᵢⱼ that minimizes the Euclidean norm of eᵢⱼ. In formula:
If you don’t follow the math, don’t worry! So far, we only wrote down in a fancy way that ĉₖᵢⱼ is the nearest color in the color palette for pixel xᵢⱼ. This formalisation, together with the fact that we can use ĉₖᵢⱼ instead of cₖ in the formula for eᵢⱼ, yields our loss function:
with X representing the set of all pixels and Cₖ a K×3 matrix, where in each row one of the K color vectors is stored. In other words, the total loss is defined as the sum of squared errors!
Alright, let’s repose for now and let the math sink in… Time to turn our heads to a somewhat more digestable topic: permutation.
In a global sense, permutations are (random) rearrangements of objects, and are widely used in statistics to perform for example significance tests. Now, within this specific image dithering context, we may assume Quokka Quinten to be a random rearrangement of pixels. This results in a rather noisy picture:
The reason that such a picture might be useful, is that it serves as another tool in formalizing the optimization as a statistically well-defined problem. Suppose we refer to this random collection of pixels as the null hypothesis (H₀). Under this assumption, our sparkling picture of Quinten is just as likely as all other N! permuted versions of the picture, with N the number of pixels. We can also apply the Floyd-Steinberg algorithm on a permuted picture under H₀ for any color palette of our choice and calculate the value of the loss function. By repeating this process hundreds of times and averaging the results, we obtain an estimate of the average loss-value, or as statisticians like to say: we get an estimate of the expected loss under H₀. Now we are really getting somewhere, so let’s complete our puzzle by putting the final missing piece in place!
The Gap statistic
Permutation allowed us to obtain an estimates of the expected loss under H₀ for the different color palettes. The crux of the matter lies in the comparison of this expected losses with the loss values for our observed picture of Quokka Quinten. It should be of no surprise that the loss values will decrease for larger K (remember, K was the number of colors in our palette), but this is not necessarily true for the difference between the expected loss and the observed loss. It is exactly this difference that defines the Gap statistic:
Choosing the optimal color palette
With this Gap statistic, the problem of choosing the optimal number of colors in our palette can be formulated as the value of K that maximizes this Gap statistic:
The only thing that is left is to visualize the results by plotting the loss values and Gap statistics against the number of colors in our palette. The picture below shows the idea. We clearly see the decreasing trend for the loss values (left pane), and a nonlinear curve for the Gap statistic. Apparently, the Floyd-Steinberg dithering algorithm represents structured pictures such as our Quokka Quinten far better than random rearrangements of the same pixels.
Pfieuw… It has been a though journey to a reformulate our problem such that we were able to choose the optimal number of colors. I sincerely hope it did not distract from the fascinating way in which we can put statistics to work and help us solve complicated problems!
Full source code: https://github.com/justinkraaijenbrink/imagedithering