How to Blur an Image on Android

Build an image-blurring algorithm

Feb 7 · 10 min read
Picture by Jean-Louis Paulin on Unsplash

There’s no blurring API provided by Android out of the box for developers to use. So, this blog shares the basic algorithm of how blurring an image is done, and also an optimized algorithm that could do that in linear time.

Note: If you like to learn some existing Android-provided image manipulation features, check out LightingColorFilter, ColorMatrix, and PorterDuff.

To do that, I explore manipulating the pixel myself to blur it. Let me share with you the algorithm for doing so, in the most basic way, and I’ll share the code for doing so (image below).

Blurring Image Basics

All digital images are made up of pixels, such as below.

To blur an image, the color of each pixel is changed, setting the new color of the pixel per the average color of the pixel surrounding it.

The above illustration shows a 9x9 surrounding pixels. I call this a Diameter of 3 pixels (since the width is 3 and the height is 3), and a radius of 1 (The radius is Diameter — 1 / 2), which is the number of pixels from the center of the image.

Averaging the surrounding of color for a pixel is done for every pixel, from left to right, top to bottom.

After all is done, the image becomes slightly blurred.

If we want a greater blur effect, we could expand to a greater radius, e.g. 2, then a greater blurring effect would be seen. An illustration radius of 2 is as below.

This is the basic blur which is also named Box Blur. Nonetheless, there are two issues with this approach.

  1. As you could see on the radius 20 blur, the image is relatively pixelated, or boxed. The average of surrounding color is not a good value to blur the image smoothly. Let’s revisit how to solve this later in a section below.
  2. The processing time increases in magnitude as the radius size increases. At radius 0, it uses about 40ms, but at radius 100, it uses 101seconds, which is more than 1.5 minutes just to blur the image!

Optimized Box Blur Computation

To solve the slow computation problem, there’s a smart way of getting the averaging boxes value, without averaging individually for each pixel. Instead, this could be done by having a two-pass through the image.

  1. Traverse through the pixel row-by-row to compute the average of the pixel with its horizontal neighboring pixel within the stated radius.
  2. After that, use the result from 1, traverse through the pixel column-by-column to compute the average of the pixel with its vertical neighboring pixel within the stated radius.

Believe it or not, the result for each of the pixels, after the two-pass above, is the same as averaging out the pixel with its surrounding given radius (two-dimensionally).

Other than the above trick, there is also a quick way to get the average of the long radius pixel.

  • As we understood, Average = Sum/Count.
  • If we could reduce the number of times we compute sum, then we could speed up the computation further.

Check out the below idea, you’ll notice we just need to sum up the first pixel. The second (and subsequent) pixel, we just need to minus the front pixel, and add the last pixel, which then eliminates the need for computing the sum.

To get a clearer illustration of both the tricks above, check out the below video.

Using this approach, regardless of the radius size, it is always flat within the 270-305ms range, as you can see in the chart below.

The blurriness is almost exactly the same as the basic box blur, except that as the radius increases, the optimized box blur is better, as it does average out the side pixel with its adjacent pixels, while the basic box blur still retains the unblurred pixel content.

The side issue has nothing to do with the algorithm, but more to do with the handling of the radius outside the cells. Nonetheless, the optimized box blur makes the side pixel computation much easier, and achieves better results.

Improve Blurriness

With the improved optimized box blur, we solve the processing time challenge. However, the blurred quality is still not good as it is pixelated (or has a boxed feel).

A much better way of blurring is called the Gaussian blur, which does not average evenly across the surrounding pixels, but uses a Gaussian matrix.

This matrix will ensure that the neighboring pixel closer to the pixel in calculation has a higher weight, while the highest weight goes to the pixel in calculation itself; while the pixel further away will get less weight.

The Gaussian matrix can be seen in the below diagram. This would ensure smoother blurring, given that the blurring of the image gets more affected by closer pixels than the pixels further away.

However, such computation is harder, and expensive, especially if we want to do this on mobile devices. To get a feel of the complexity, you could get the computation from the Gaussian kernel calculator.

Quick Workaround

Anyway, one simple way to do away with it without actually using the Gaussian matrix, is to perform the box blur multiple times. Below is an example of a box blurred using radius 20 for a one-time and two-time comparison.

The second time makes the image better than the first one. Nonetheless, it requires you to process two times and it might be over-blurred.

Stack Blur

To solve this further, a Google Art and Culture resident, Mario Klingemann, came up with a smart and quick way to efficiently compute, which is close to the Gaussian blur, which he named Stack blur.

To get a quick feel of it, you could play around with it in a JavaScript program as below.

Looking at the code itself seems complicated. So, I closely examined it and came up with an explanation of how it does things.

Optimized Box Blur Approach

First of all, it leverages the optimized box blur approach as described above closely, by having a two-pass operation, i.e.

  1. Horizontally, row-by-row compute the stack blur to produce a temporal result.
  2. Repeat the similar step as in 1, but instead do it vertically, column-by-column, with the source as the temporal result. The result produces a final blurred image.

I’ll not elaborate this further as it is exactly the same frame as the optimized box blur. Instead, I’ll describe what it does for each pass.

Stack blur single pass

The main difference between stack blur and box blur is, instead of getting the average color of the sum of the surrounding radius pixels, it adds itself more times than its neighboring pixels.

The further the neighbor pixel is from the computed pixel, the lesser it will be added to the sum.

As for how many times it will add its neighboring pixel, it is radius — distance — 1, where the distance is how far the neighbor pixel is from the targeted pixel.

This is more clearly illustrated below. Note that Pixel 4 is added most (i.e. radius + 1 times), while Pixel 3 and 5 are added 3 times, Pixel 2 and 6 are added 2 times, and Pixel 1 and 7 are added once only.

The denominator is computed using radius * (radius + 2) + 1.

Using this approach, it indirectly ensures the result is similar to what a 2D Gaussian matrix provides.

Optimized calculation

At first looks, the computation seems to be complex. However, if examined closely, the entire stack could be reused by the next pixel. Check out the below diagram, it’s pretty similar to the box blur long radius averaging technique.

In addition to that, both the Outgoing-items and Incoming-items also could be efficiently added, without re-extracting each of them for each pixel calculation.

This is done as shown below.

  • During computing for Pixel 4, we already prepare the next Pixel, i.e. 7.
  • Upon computing for Pixel 5, we remove pixel 1 from Outgoing-items, and move Pixel 4 from Incoming-items to Outgoing-items list. Then, finally add Pixel 7 into the Incoming-items list.

Both these techniques enable the stack flow to also get performed in linear time regardless of how long the radius is. In the code I wrote, I got ~450 ms regardless of radius length.

The result is shown below. Clearly, when compared with box blur:

  • Stack blur once at 20 radius already has a smoother blur.
  • Stack blur 2 times makes it more blurry, but the blurred flowers are still very much intact.

Another very clearly shown stack blur smoothness compared to box blur using radius 100. The box blur is lined and boxed-feeling.

Optimized Stack Blur

The stack blur code I wrote based on my understanding is not optimized for speed but for clearer understanding.

Victor Laskin wrote a multi-threaded friendly and optimized version in C++.

Enrique López-Mañas rewrote it in Java based on what Victor Laskin wrote, and put it into a library.

This optimized approach took ~50ms to blur the image, about 10x faster than the version I wrote above.

The speedup is helped by multi-threading and other optimizations.

The other clear optimization is by turning the denominator division done into a table of stackblur_mul and stackblur_shr. The value in this table is used so that the code doesn’t need to perform any actual division.

The table looks confusing at a glance. This is a little tricky to understand. So, let me explain them below.

private static final short[] stackblur_mul = {
512, 512, 456, 512, 328, 456, 335, 512, 405, 328, 271, 456, ......

private static final byte[] stackblur_shr = {
9, 11, 12, 13, 13, 14, 14, 15, 15, 15, 15, 16, 16, 16, 16, 17,

How the table works

From the code, we can see that after we perform the sum, instead of dividing it using the denominator as below:


It becomes:

( Sum * mul_sum ) >>> shr_sum

Each of the values in the table corresponds to the radius used. If we used radius 2, then mul_sum = 456 and shr_sum = 12.

As we know, X >>> Y is X / (2^Y), so the actual formula is:

( Sum * mul_sum ) / (2^shr_sum)

To prove:

Sum/Denominator == ( Sum * mul_sum ) / (2^shr_sum)
==> ( mul_sum * Denominator ) / (2^shr_sum) == 1

Let’s take a radius of 2, and verify:

  • When denominator = 5, mul_sum = 456 and shr_sum = 12.

When we plug in the number, it gets almost 1.

9 * 456 / 2^12 = 4104/4096 = ~1

To verify that each of the numbers in the table is correct, below is a snapshot of an Excel spreadsheet showing the value computed. All of them are ~1.

This enables a much faster computation of the stack blur value without needing division, with minimal error margin.

There might be other optimizations in the code, but they are much easier to identify. Hence, I will not elaborate further here.

Overall, from the image, the produced quality using the optimized stack blur and my stack blur, they are the same, but with a ~10x improvement time.


If you like something more mathematical about the blurring concept, you could refer to this website.

Thanks for reading.

Better Programming

Advice for programmers.


Written by

Passionate in learning and sharing mobile development and its adjacent technology. If you like to buy me a coffee visit Thanks!

Better Programming

Advice for programmers.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade