[CV] 2. Image Processing Basic: Gaussian and Median Filter, Separable 2D filter

Published in

jun-devpBlog

8 min readNov 11, 2020

1. Recap

**1.1 correlation and convolution**

Let F be an image and H be a filter (kernel or mask). Then Correlation performs the weighted sum of overlapping pixels in the window between F and H.

Convolution has almost similar procedures except for the fact that it flips the filter H first. In the following image, instead of H, the image F is flipped. It seems inconsistent with what I just described. However, the statement still holds since the convolution operation is commutative, meaning as long as we first flip either filter H or image F then perform correlation, it is fine.

Figure 1. mathematical definition of correlation and convolution

2. Filters in more detail

2.1 Smoothing

Filtering has an effect of smoothing the given image as we’ve seen in the previous article, and as is shown in Fig 2.

Figure 2. Left panel: original image. Right panel: Image after filtering with the box filter, from [1], [5]

One drawback of applying the box filter is that it introduce the ringing artifacts, losing large portion of fine image detail.

While keeping the effect of smoothing in order to remove noises in images, another representative approach, the Gaussian kernel, has been applied.

2.2 Gaussian Smoothing

Gaussian kernel, as its name implies, has the shape of the function ‘Gaussian distribution’ to define the weights inside the kernel, which are used to compute the weighted average of the neighboring points (pixels) in an image.

Figure 3. Illustration of Gaussian. Left panel: 1D Gaussian distribution. Middle panel: General form of 2D Gaussian with zero mean. Right panel: depiction of Gaussian filter in grayscale (white = high, black = low value), from [1], [5]

In other words, each value in the Gaussian filter is from the zero mean Gaussian distribution. One thing we need to keep in mind is that the kernel size is dependent of standard devidation 𝛔 of Gaussian function:

Rule of thumb for kernel size = 2 ·⌈3𝝈⌉ +1

By setting the standard deviation 𝛔, we can control until what certain extent we smooth the image. In other words, the higher the standard deviation gets the stronger effect of smoothing effect the image has.

With python and numpy, we can easily build Gaussian kernel as follows:

After defining your Gaussian kerenl, DO NOT FORGET TO NORMALIZE! Since it plays a role of weighted averaging, the sum of weights must be one.

2.3 Why Gaussian kernel is a good choice?

So far we have taken a look at how the Guassian kernel is defined, but what makes Gaussian kernel better than the box filter?

2.3.1 Properties

Computationally efficient: larger filters (e.g. 2D) can be implemented using smaller 1D filters
Rotationally symmetric:
The degree of smoothing is determined by the given standard deviation σ
Very effective to filtering Gaussian noise (low-pass filter)
Essential when down-sampling images (Gaussian Pyramid)

Apart from first three obvious properties, which directly come from the characteristics of Gaussian distribution, let us focus on the last two properties.

2.3.2 Why Gaussian Filter is efficient to remove noise?

Fourier Transform

Before getting into the answer for this question, we need to know the Fourier transform first. The breif version of idea for Fourier transform is the decomposition of function (also called a signal) into its frequency components, meaning Fourier transform refers to representing the orignial signal (function of time = f(t)) in an associated frequency domain. This is well illustrated in the following figure.

**Figure 4. Purple: original periodic signal. Red to Green: decomposed original signal in different frequencies, from [6]**

One might ask, how do we extract the frequency information from a function? Of course it is not feasible to obtain such frequency representation directly from the original signal f(t). The basic assumption behind this is that any signal can be represented as a sum of periodic components (e.g. Trigonometric functions) with varying frequencies (amplitude and phase), as is shown below.

**Equation 1. Represented signal f by trigonometric functions. please refer to** **here** **for derivation in more detail**

Given the assumption, what Fourier transform does is to compute the contribution (which is called Frequency Coefficient) of a function (sine, cosine) with certain frequency to compose the original signal f.

Figure 5. Complete picture of Fourier transform operation, from [1]

To acheive this, it applies the inner product between two signals: original signal f and signal e^{-2𝜋 i w t}, whose Frequency coefficient of Frequency (w) is what we want to know.

**Equeation 2. Functional definition of Fourier Transform where frequency is w**

To apply the Fourier transform into the Filtering of Computer Vision, There are two major properties that we need to be aware of.

(1) Fourier transform of Gaussian is a Gaussian, and Fourier transform of Box filter is a sinc function

**Figure 6. Illustration of Fourier transformed Gaussian and Box filter, from [1]**

(2, IMPORTANT) Convolving two functions corresponds to the product of them in frequency domain.

Equation 3. Convolutional property of Fourier Transform

So much about Fourier transform, and now let’s finally get back to the first question: “why Gaussian filter is effective to remove Gaussian noise?”

Figure 7. 1D representation of pixel intensities in an image, from [2]

By comparing pixel-intensity graphs above, we can observe that the graph in the righthand side varies much more due to the additive Gaussian noise in pixel intensities of image f. This additive Gaussian noise introduces high frequencies (corresponds to low periods), indicating if we remove high frequency components in the noise-introduced image f, It is likely to retrieve the original image f.

And here comes why the Gaussian filter is better than box filter.

By using Eq 3, we know that convolution of two signal: 𝑓 and 𝐠 in time domain corresponds to the product of their fourier-transformed signals in frequency domain. This means, by designing filter 𝐠 whose shape is low-pass filter in frequency domain, which only has positive values for low frequencies and zero for high frequencies, and then convolving it with the given signal 𝑓, we can obtain noise-removed signal.

And as is illustrated in Fig 8, Gaussian filter is a better chose for 𝐠 as its fourier-transformed shape is the ideal low-pass filter, allowing only low frequencies to survive. The box filter, on the other hand, turns into the sinc function after Fourier transform, and it does not only suppressing high frequencies but also some low frequency components, which are responsible for general image structures and are supposed not to be lost.

The last property of Gaussian filter regarding Gaussian Pyramid that I have not gone through yet will be will be dealt with in the next article.

2.4 Non-Linear Filter

2.4.1 Median Filter

Median Filter is one of Non-linear filters, which is also used for smoothing. Its basic idea is to replace each pixel by the median of its neigboring pixels (pixels in the window).

By doing so, it removes some spikes introduced by noise: especially impulse and salt & pepper noise. This is because stand-alone noise pixels with extreme intensities like black and white cannot survive after median filtering. Another advantage of median filter is that it does not introduce new pixel values since it only re-use exisiting pixel values from window. Further, unlike other averaging filters, it remove noise without losing edge information as is illustrated below.

Figure 8. Visual comparison between Median filter and Mean filter, from [1], [2]

Last remark regarding Median filter is that it can be considered when Gaussian filter cannot solve the problem. Fig 9 shows a visual comparison between Gaussian and Median filter. We can observe that when the noise level is too high, although the amount of noise pixel decreases with increasing Gaussian filter size, they still exist in the image. Median filter, on the other hand, already remove most of noise pixels with 3 x 3 filter size. By applying larger filter size, Median filter further exclude noise pixels but it loses a lot of image-structure informatin and image details.

Figure 9. Visual comparison between Gaussian filter and Median filter, from [7]

3. More details about Filter

Let’s take a closer look into details that we overlook in previous chapters.

3.1 Why Gaussian kernel is computationally efficient?

More precisely, the statement must be why separable 2D kernels are computationally efficient? Let us start with what happens in 2D convolution if the chosen 2D kernel h is seperable.

As is proven above, convolving with 2D seperable filter is identicle to the convolution with two 1D filters.

Let’s consider number of computation for each step that 2D convolution requires. The filter size is 2k + 1 (see previous article), meaning the window size for 2D convolution is (2k + 1)². Now in comparison, consider two phases of 1D convolution. The filter size remains the same as 2k + 1, but this time the window size is also 2k + 1 as here the convolution is 1D. But we need to perform convolution two times. Therefore, the number of total computation for each step will be 2·(2k + 1). This indicates any separable 2D kernel can reduce its computation by performing two times of 1D convolution 2·(2k + 1), instead of one 2D convolution (2k + 1)².

And Gaussian kernel is separable 2D kernel. This is why it is computationally efficient.

3.2 Boundary issue when filtering

While performing convolution, there are three options we can choose regarding the convolution-output image size.

(1) Full: the output size is sum of an image f and kernel g (when at least one pixel overlaps between f and g)
(2) Same: the output size is same as image f (when at least the center of filter overlaps)
(3) valid: the output size matches neither image f nor kernel g