Dual Blur and Its Implementation in Unity

6 min readSep 7, 2022

Dual Blur (Dual Filter Blur) is an improved algorithm based on Kawase Blur. This algorithm uses Down-Sampling to reduce the image and Up-Sampling to enlarge the image to further reduce the texture reads while making full use of the GPU hardware characteristics.

First, down-sampling the image, and reduce the length and width of the original image by 1/2 to obtain the target image. As shown in the figure, the pink part represents one pixel of the target image after being stretched to the original size, and each small white square represents one pixel of the original image. When sampling, the selected sampling position is the position represented by the blue circle in the figure, which are: the four corners and the center of each pixel of the target image, the weights are 1/8 and 1/2, and their UV coordinates are substituted into Sampling from the original image. One pixel of the target image is processed, and the texture is read through 5 times, so that 16 pixels of the original image participate in the operation. The number of pixels of the obtained target image is reduced to 1/4 of the original image. Then perform multiple down-sampling, and the target image obtained each time is used as the original image for the next sampling. In this way, the number of pixels that need to be involved in the operation for each down-sampling will be reduced to 1/4.

Then perform up-sampling (Up-Sampling) of the image, and expand the length and width of the original image by 2 times to obtain the target image. As shown in the figure, the pink part indicates that the target image is reduced to one pixel of the original image size. Each small white square represents a pixel of the original image. When sampling, the selected sampling positions are the positions represented by the blue circles in the figure, which are: the four corners of the corresponding pixel of the original image and the center of the four adjacent pixels, and the weights are 1/6 and 1/12 respectively. One pixel of the target image is processed, and 8 textures are read, so that 16 pixels of the original image participate in the operation. The number of pixels of the obtained target image is expanded to 4 times that of the original image. In this way, the up-sampling operation is repeated until the image is restored to its original size, as shown in the following figure:

Unity Implementation

According to the above algorithm, we implement the Dual Blur algorithm on Unity: choose 4 down-sampling and 4 up-sampling for blurring.

Down-sampling Implementation:

float4 frag_downsample(v2f_img i) :COLOR
{
float4 offset = _MainTex_TexelSize.xyxy*float4(-1,-1,1,1);
float4 o = tex2D(_MainTex, i.uv) * 4;
o += tex2D(_MainTex, i.uv + offset.xy);
o += tex2D(_MainTex, i.uv + offset.xw);
o += tex2D(_MainTex, i.uv + offset.zy);
o += tex2D(_MainTex, i.uv + offset.zw);
return o/8;
}

Up-sampling Implementation:

float4 frag_upsample(v2f_img i) :COLOR
{
float4 offset = _MainTex_TexelSize.xyxy*float4(-1,-1,1,1);
float4 o = tex2D(_MainTex, i.uv + float2(offset.x, 0));
o += tex2D(_MainTex, i.uv + float2(offset.z, 0));
o += tex2D(_MainTex, i.uv + float2(0, offset.y));
o += tex2D(_MainTex, i.uv + float2(0, offset.w));
o += tex2D(_MainTex, i.uv + offset.xy / 2.0) * 2;
o += tex2D(_MainTex, i.uv + offset.xw / 2.0) * 2;
o += tex2D(_MainTex, i.uv + offset.zy / 2.0) * 2;
o += tex2D(_MainTex, i.uv + offset.zw / 2.0) * 2;
return o / 12;
}

Implement the corresponding pass:

Pass
{
ZTest Always ZWrite Off Cull Off
CGPROGRAM
#pragma target 3.0
#pragma vertex vert_img
#pragma fragment frag_downsample
ENDCG
}
Pass
{
ZTest Always ZWrite Off Cull Off
CGPROGRAM
#pragma target 3.0
#pragma vertex vert_img
#pragma fragment frag_upsample
ENDCG
}

Repeat down-sampling and up-sampling in OnRenderImage:

private void OnRenderImage(RenderTexture src, RenderTexture dest)
{
int width = src.width;
int height = src.height;
var prefilterRend = RenderTexture.GetTemporary(width / 2, height / 2, 0, RenderTextureFormat.Default);
Graphics.Blit(src, prefilterRend, m_Material, 0);
var last = prefilterRend;
for (int level = 0; level < MaxIterations; level++)
{
_blurBuffer1[level] = RenderTexture.GetTemporary(
last.width / 2, last.height / 2, 0, RenderTextureFormat.Default
);
Graphics.Blit(last, _blurBuffer1[level], m_Material, 0);
last = _blurBuffer1[level];
}
for (int level = MaxIterations-1; level >= 0; level–)
{
_blurBuffer2[level] = RenderTexture.GetTemporary(
last.width * 2, last.height * 2, 0, RenderTextureFormat.Default
);
Graphics.Blit(last, _blurBuffer2[level], m_Material, 1);
last = _blurBuffer2[level];
}
Graphics.Blit(last, dest); ;
for (var i = 0; i < MaxIterations; i++)
{
if (_blurBuffer1[i] != null)
{
RenderTexture.ReleaseTemporary(_blurBuffer1[i]);
_blurBuffer1[i] = null;
}
if (_blurBuffer2[i] != null)
{
RenderTexture.ReleaseTemporary(_blurBuffer2[i]);
_blurBuffer2[i] = null;
}
}
RenderTexture.ReleaseTemporary(prefilterRend);
}

Obtaining the result:

You can see the process through FramDebug:

(Second Down-Sampling)

(Fourth Down-Sampling)

(Second Up-Sampling)

Summary

The Dual Blur algorithm, based on Kawase Blur, adopts down-sampling to reduce the image and up-sampling to enlarge the image. By reducing the number of pixels of the processed image, the number of texture sampling is further reduced, thereby improving the efficiency.

The test results in the sharing of the ARM team’s “Bandwidth-Efficient Rendering” are as follows:

In experiments on mobile devices equipped with Mali-T760 MP8 (such as Samsung Galaxy S6), we can see that the Dual Blur algorithm has the fastest speed and the best performance.

There is not much difference in read&write bandwidth between Blur’s improved algorithms.

In summary, it can be said that the Dual Blur algorithm is the most efficient algorithm to achieve blur effect at this stage.

Screen Post Processing Effects Series

Screen Post Processing Effects Chapter 4: Kawase Blur and Its Implementation

Screen Post Processing Effects Chapter 4: Box Blur and Its Implementation

Screen Post Processing Effects Chapter 3: Algorithm of Gaussian Blur and Implementation Using Linear Sampling

Screen Post Processing Effects Chapter 2: Two-Step One-Dimensional Operation Algorithm of Gaussian Blur and Its Implementation

Screen Post Processing Effects Chapter 1 — Basic Algorithm of Gaussian Blur and Its Implementation

That’s all for today’s sharing. Of course, life is boundless but knowing is boundless. In the long development cycle, these problems you see maybe just the tip of the iceberg. We have already prepared more technical topics on the UWA Q&A website, waiting for you to explore and share them together. You are welcome to join us, who love progress. Maybe your method can solve the urgent needs of others, and the “stone” of other mountains can also attack your “jade”.

Dual Blur and Its Implementation in Unity

Unity Implementation

Summary

Screen Post Processing Effects Series

Written by UWA

No responses yet