Noise removal in images using deep learning models

Sunil Belde · Published in Analytics Vidhya · Apr 23, 2021 · 8 min read

Table of Contents:

  1. Overview of the problem.
  2. Usage of Deep Learning models.
  3. Data collection and preparation.
  4. Performance metric.
  5. First cut approach.
  6. Experimentation with different models.
  7. Model Quantization.
  8. Model Analysis.
  9. Building a web application.
  10. Future work.
  11. References.

1. Overview of the problem

Image denoising is the process of removing noise from an image. The addition of noise causes loss of information. Noise can originate in many ways, such as capturing images in low-light conditions, damage to electronic circuits due to heat, the sensor illumination levels of a digital camera, faulty memory locations in hardware, or bit errors in the transmission of data over long distances.

What is noise?

Noise is unwanted pixel values added to an image, causing loss of information. Noise can be of various types, such as:

Impulse Noise (IN), where some pixel values are completely different from the surrounding pixel values. Impulse noise is of two types: salt-and-pepper impulse noise (SPIN) and random-valued impulse noise (RVIN).

Additive White Gaussian Noise (AWGN), where each pixel in the image is changed from its original value by a small random amount.

2. Use of Deep Learning models

It is essential to remove the noise and recover the original image in applications where a clean image is important for robust performance, or where filling in missing information is very useful, such as astronomical images taken of very distant objects.

Convolutional neural networks work well with images. We try multiple deep neural network architectures proposed in research papers and compare the results of each model.

3. Data collection and preparation

We will use publicly available images and modify them according to our requirements.

Data source : https://github.com/BIDS/BSDS500

This dataset (BSDS500) is provided by the University of California, Berkeley, and contains 500 natural images.

We split these 500 images into 400 train images and 100 test images.

Now we create patches out of these images with a patch size of 40 x 40, a stride of 40, and different crop sizes. After doing so, we get 85,600 train patches and 21,400 test patches.

Patches from a given image can be obtained by using the below code :
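A minimal sketch of such a patch extractor, assuming the image is a NumPy array; the crop scales used here are illustrative rather than the exact values from the original notebook:

```python
import numpy as np

def get_patches(image, patch_size=40, stride=40, scales=(1.0, 0.9, 0.8, 0.7)):
    """Extract patch_size x patch_size patches from an image at several crop scales."""
    patches = []
    for scale in scales:
        h, w = int(image.shape[0] * scale), int(image.shape[1] * scale)
        cropped = image[:h, :w]                       # crop to the scaled size
        for i in range(0, h - patch_size + 1, stride):
            for j in range(0, w - patch_size + 1, stride):
                patches.append(cropped[i:i + patch_size, j:j + patch_size])
    return np.array(patches)
```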

4. Performance metric

PSNR is the most commonly used metric to measure the quality of an image reconstructed after denoising.

The term peak signal-to-noise ratio (PSNR) is an expression for the ratio between the maximum possible value (power) of a signal and the power of the distorting noise that affects the quality of its representation. PSNR is usually expressed on a logarithmic decibel scale.

Given a ground truth image (g) and a noisy image (f) of size m x n, PSNR is computed from the mean squared error (MSE) between the two images.
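The original formula images are not reproduced here; the standard definitions, with MAX_f the maximum possible pixel value (255 for 8-bit images), are:

```latex
\mathrm{PSNR} = 20\,\log_{10}\!\left(\frac{\mathrm{MAX}_f}{\sqrt{\mathrm{MSE}}}\right),
\qquad
\mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\bigl(g(i,j) - f(i,j)\bigr)^{2}
```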

5. First cut approach

In the first cut approach, we create input pipelines that take the patch data as input and add some random noise to it. With these noisy patches, we train a simple convolutional autoencoder model using TensorFlow Keras.

Code for the input pipelines using tf.data :
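A minimal sketch of such a pipeline; the noise-level range, batch size, and the names train_patches / test_patches (the patch arrays from step 3) are assumptions, and patches are scaled to [0, 1]:

```python
import tensorflow as tf

def add_gaussian_noise(patch):
    """Add AWGN with a randomly chosen standard deviation (assumed range 0-55 / 255)."""
    sigma = tf.random.uniform([], 0.0, 55.0) / 255.0
    noisy = patch + tf.random.normal(tf.shape(patch), stddev=sigma)
    return tf.clip_by_value(noisy, 0.0, 1.0), patch   # (noisy input, clean target)

def make_dataset(patches, batch_size=64, training=True):
    """Build a tf.data pipeline that yields (noisy, clean) batches of patches."""
    ds = tf.data.Dataset.from_tensor_slices(patches.astype("float32") / 255.0)
    if training:
        ds = ds.shuffle(buffer_size=10_000)
    ds = ds.map(add_gaussian_noise, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

train_ds = make_dataset(train_patches)                 # hypothetical patch arrays from step 3
test_ds = make_dataset(test_patches, training=False)
```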

Now we train an autoencoder model with Mean Squared Error (MSE) as the loss function and Adam with an initial learning rate of 1e-03 and a small decay as the optimizer.
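A simple convolutional autoencoder along those lines might look like the sketch below; the filter counts are assumptions, since the exact architecture from the original notebook is not reproduced here:

```python
from tensorflow.keras import layers, models, optimizers

def build_autoencoder(patch_size=40, channels=3):
    """Small convolutional autoencoder: the encoder downsamples, the decoder upsamples back."""
    inputs = layers.Input(shape=(patch_size, patch_size, channels))
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(2, padding="same")(x)
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2, padding="same")(x)
    x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.UpSampling2D(2)(x)
    outputs = layers.Conv2D(channels, 3, padding="same", activation="sigmoid")(x)

    model = models.Model(inputs, outputs)
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-3), loss="mse")
    return model
```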

With the autoencoder, we attained a train loss of 0.0020 and a test loss of 0.0021.

6. Experimentation with different models

We will try different deep learning architectures that are used for the image denoising task.

6.1. DNCNN

Research paper : https://arxiv.org/pdf/1608.03981v1.pdf

Architecture :

Given a noisy input image 'y', the model predicts a residual image 'R', and we can get the clean image 'x' as x = y - R.

The model consists of three types of layers with a total depth of D :

(i) Conv+ReLU: for the first layer, 64 filters of size 3 x 3 x c are used to generate 64 feature maps, and rectified linear units (ReLU) are then utilized for nonlinearity. Here c represents the number of image channels, i.e., c = 1 for a grayscale image and c = 3 for a color image.

(ii) Conv+BN+ReLU: for layers 2 to (D - 1), 64 filters of size 3 x 3 x 64 are used, and batch normalization is added between convolution and ReLU.

(iii) Conv: for the last layer, c filters of size 3 x 3 x 64 are used to reconstruct the output.

This model has two main features: a residual learning formulation to learn 'R', and batch normalization, which speeds up training as well as improves denoising performance.
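A minimal Keras sketch of a DnCNN-style network, assuming a depth of D = 17 as in the paper; a Subtract layer (discussed further below) is included at the end so the model outputs the denoised image directly:

```python
from tensorflow.keras import layers, models

def build_dncnn(depth=17, filters=64, channels=3):
    """DnCNN-style network: Conv+ReLU, then (depth - 2) Conv+BN+ReLU blocks, then Conv."""
    noisy = layers.Input(shape=(None, None, channels))
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(noisy)   # (i) Conv+ReLU
    for _ in range(depth - 2):                                                # (ii) Conv+BN+ReLU
        x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    residual = layers.Conv2D(channels, 3, padding="same")(x)                  # (iii) Conv -> R
    denoised = layers.Subtract()([noisy, residual])                           # x = y - R
    return models.Model(noisy, denoised)
```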

The model was trained for 30 epochs with the Adam optimizer at a learning rate of 0.001 and a learning-rate decay of 5% per epoch; Mean Squared Error (MSE) was used as the loss function.
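The training call might look roughly like this, assuming the build_dncnn sketch above and the train_ds / test_ds pipelines from step 5; the 5% per-epoch decay is implemented with a LearningRateScheduler:

```python
import tensorflow as tf

model = build_dncnn()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")

# Multiply the initial learning rate by 0.95 for each epoch (5% decay per epoch).
lr_decay = tf.keras.callbacks.LearningRateScheduler(lambda epoch: 1e-3 * 0.95 ** epoch)
model.fit(train_ds, validation_data=test_ds, epochs=30, callbacks=[lr_decay])
```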

Plotting denoised patches :

Now we combine all the denoised patches of an image to get the complete image. We can do that by using the below code :
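A sketch of that reconstruction, assuming the patches were generated in the same row-major order as the extractor above and averaging any overlapping pixels:

```python
import numpy as np

def patches_to_image(patches, image_shape, patch_size=40, stride=40):
    """Reassemble denoised patches into a full image, averaging overlapping pixels."""
    h, w, c = image_shape
    image = np.zeros((h, w, c), dtype=np.float32)
    count = np.zeros((h, w, c), dtype=np.float32)
    idx = 0
    for i in range(0, h - patch_size + 1, stride):
        for j in range(0, w - patch_size + 1, stride):
            image[i:i + patch_size, j:j + patch_size] += patches[idx]
            count[i:i + patch_size, j:j + patch_size] += 1.0
            idx += 1
    return image / np.maximum(count, 1.0)
```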

Residual Learning :

After learning this residual image, we subtract it from the input. So we add a Subtract layer at the end of the model to get the denoised image as output.

6.2. RIDNET

Research paper : https://arxiv.org/pdf/1904.07396.pdf

Architecture :

This model is composed of three main modules, i.e., feature extraction, feature learning with residual on the residual, and reconstruction, as shown in the figure.

Enhancement Attention Modules (EAMs) use a residual-on-the-residual structure with local and short skip connections. Each EAM is further composed of D blocks followed by feature attention. The residual-on-the-residual architecture makes very deep networks possible, which improves denoising performance.
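A heavily simplified sketch of one EAM with feature attention (squeeze-and-excitation style channel attention); the real RIDNET blocks contain more convolutions, so this only illustrates the skip-connection and attention pattern:

```python
from tensorflow.keras import layers

def feature_attention(x, filters, reduction=16):
    """Channel attention: global average pool, bottleneck, then rescale the feature maps."""
    w = layers.GlobalAveragePooling2D()(x)
    w = layers.Reshape((1, 1, filters))(w)
    w = layers.Conv2D(filters // reduction, 1, activation="relu")(w)
    w = layers.Conv2D(filters, 1, activation="sigmoid")(w)
    return layers.Multiply()([x, w])

def eam_block(x, filters=64):
    """Simplified Enhancement Attention Module with local and short skip connections."""
    skip = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(y)
    y = layers.Add()([y, x])                       # local skip connection
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(y)
    y = feature_attention(y, filters)              # feature attention
    return layers.Add()([y, skip])                 # short skip connection
```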

The model was trained for 20 epochs with the Adam optimizer at a learning rate of 0.001 and a learning-rate decay of 10% per epoch; Mean Absolute Error (MAE) was used as the loss function.

Plotting denoised patches :

Plotting the images constructed from patches :

6.3. Comparison of models:

Tabulating the performance (PSNR in dB) obtained by the models on an image at different noise levels :

7. Model Quantization

Quantization for deep learning is the process of approximating a neural network that uses floating-point numbers by a neural network of low bit width numbers. This dramatically reduces both the memory requirement and computational cost of using neural networks.

After quantization, the size of the DNCNN model is reduced from 7 MB to 2 MB, and the RIDNET model from 18 MB to 6 MB.

These quantized models are very useful where we have computational constraints, and their performance remains close to that of the original models.
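For example, post-training dynamic-range quantization with the TensorFlow Lite converter looks roughly like this; the file names are placeholders, and the exact conversion settings used here are not shown in the post:

```python
import tensorflow as tf

def quantize_model(saved_model_dir, output_path):
    """Convert a saved Keras/TF model to a dynamic-range quantized TFLite model."""
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    with open(output_path, "wb") as f:
        f.write(converter.convert())

quantize_model("dncnn_saved_model", "dncnn_quantized.tflite")   # placeholder paths
```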

Here are some statistics that show the performance of the quantized and original DNCNN and RIDNET models :

From the above observations, we can see that RIDNET achieves slightly higher PSNR values than the DnCNN model, but RIDNET takes much more time and its model size is also larger.

8. Model Analysis

We perform some analysis on the DNCNN model.

8.1. Different noise levels :

We can observe that the model performs well on images with noise levels in the range of 10–35. As the noise levels increase, there is very little improvement in PSNR.

Above a noise level of 60, it is difficult for the model to reconstruct the image from the given noisy input.

8.2. Performance on different images :

A plain image, where noise can be easily observed, gets a high PSNR.
An image with a single background also has a good PSNR.
As the complexity of the image increases in terms of pixel distribution (a lot of color variation), the PSNR is comparatively lower.
An image with a lot of color variation in every area has a decreased PSNR.

The model tries to predict the residual noise at a pixel by looking at the pixels distributed around it.
We can see that model performance decreases as the complexity of the image increases in terms of color variation and pixel distribution. However, this impact is negligible, as we reconstruct the image by predicting on smaller patches of the image.

9. Building a web application

The entire project is deployed using Streamlit.
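A bare-bones version of such an app might look like the sketch below. The model file name is a placeholder, and for brevity this version predicts on the whole uploaded image at once rather than splitting it into patches first:

```python
import numpy as np
import streamlit as st
from PIL import Image
from tensorflow.keras.models import load_model

model = load_model("dncnn.h5", compile=False)   # placeholder model file

st.title("Image Denoising")
uploaded = st.file_uploader("Upload a noisy image", type=["png", "jpg", "jpeg"])
if uploaded is not None:
    noisy = np.array(Image.open(uploaded).convert("RGB"), dtype=np.float32) / 255.0
    denoised = np.clip(model.predict(noisy[np.newaxis, ...])[0], 0.0, 1.0)
    st.image([noisy, denoised], caption=["Noisy input", "Denoised output"])
```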

Link to running application : https://share.streamlit.io/sunilbelde/imagedenoising-dncnn-ridnet-keras/main/app.py

Demo of running application :

10. Future work

Currently, these deep learning models are trained only on images with Additive White Gaussian Noise (AWGN).

In the future, we will try images with other noise types such as impulse noise (IN), salt-and-pepper impulse noise (SPIN), and random-valued impulse noise (RVIN), and train models with architectures that better suit these types of noise.

— — — — THANK YOU FOR READING — — — —
