Review — SFA: Simplified-Fast-AlexNet (Blur Classification)

Blur Image Classification Using Simplified AlexNet

Sik-Ho Tsang

Published in Nerd For Tech, May 22, 2021

In this story, Blur Image Classification based on Deep Learning (SFA) is reviewed. In this paper:

Simplified-Fast-AlexNet (SFA) is designed to classify whether an image is degraded by defocus blur, Gaussian blur, haze blur, or motion blur.

This is a paper in 2017 IST. (Sik-Ho Tsang @ Medium)

Outline

1. Brief Overview of Image Blur Modelling
2. Simplified-Fast-AlexNet (SFA): Network Architecture
3. Datasets
4. Experimental Results

1. Brief Overview of Image Blur Modelling

Image blur can be regarded as a degradation process that turns a high-quality image into a low-quality blurred image:

$$F = f * h + n$$

where F denotes the degraded image, f is the original (lossless) image, h is the blur kernel, a.k.a. the point spread function (PSF), * denotes the convolution operator, and n is additive noise, here Gaussian white noise.
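
As a minimal sketch of this degradation model, the snippet below convolves an image with a PSF and adds Gaussian white noise. The function names, the zero padding at the image borders, and the default noise level are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def convolve2d_same(image, kernel):
    """Naive 'same'-size 2-D convolution with zero padding (odd-sized kernel assumed)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * padded[i:i + rows, j:j + cols]
    return out

def degrade(f, h, noise_sigma=0.01, seed=0):
    """Return F = f * h + n, where n is Gaussian white noise (assumed std: noise_sigma)."""
    rng = np.random.default_rng(seed)
    n = rng.normal(0.0, noise_sigma, size=f.shape)
    return convolve2d_same(f, h) + n
```

With a delta kernel (a single 1 at the centre) and `noise_sigma=0.0`, `degrade` returns the input unchanged, which is a quick sanity check of the convolution.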

1.1. Gaussian Blur

In many practical applications, such as remote sensing and satellite imaging, the Gaussian kernel function is regarded as the kernel function of atmospheric turbulence:

in which σ is the kernel radius and R is the support region, which usually satisfies the 3σ criterion.
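
A Gaussian PSF of this kind can be sketched as follows. The snippet assumes the standard isotropic form, $h(x, y) \propto \exp(-(x^2 + y^2) / (2\sigma^2))$ for $(x, y) \in R$, with the support R taken as a square of half-width 3σ and the kernel normalised to sum to 1. The function name and the discretisation are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised 2-D Gaussian PSF on a square support of half-width ceil(3 * sigma)."""
    radius = int(np.ceil(3 * sigma))          # 3-sigma support region (assumption)
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    h = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return h / h.sum()                        # weights sum to 1, preserving brightness
```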

1.2. Motion Blur

Motion blur is another blur type to consider. It is caused by relative linear motion between the target and the camera:

where M denotes the length of motion in pixels and ω is the angle between the motion direction and the x-axis.
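
A linear motion PSF can be sketched by spreading equal weight along a line segment of length M at angle ω through the kernel centre, with zero weight elsewhere. The rasterisation below (dense sampling plus rounding to the nearest pixel) and the function name are assumptions for illustration. The paper's exact discretisation is not given here.

```python
import numpy as np

def motion_kernel(length, angle_deg):
    """Linear motion PSF: uniform weight along a segment of `length` pixels at `angle_deg`."""
    size = int(length) | 1                    # odd support so there is a centre pixel
    c = size // 2
    h = np.zeros((size, size))
    theta = np.deg2rad(angle_deg)
    half = (length - 1) / 2.0
    # Sample densely along the segment and mark the nearest pixel for each sample.
    for t in np.linspace(-half, half, 4 * size):
        col = int(round(c + t * np.cos(theta)))
        row = int(round(c - t * np.sin(theta)))  # image rows grow downwards
        h[row, col] = 1.0
    return h / h.sum()
```

For example, `motion_kernel(5, 0)` puts a weight of 0.2 on each pixel of the middle row, and `motion_kernel(5, 90)` does the same for the middle column.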

1.3. Defocus Blur

Defocus blur is the most common blur in daily life, and it can be modeled by a cylinder function:

where r denotes the blur radius, which is proportional to the extent of defocus.
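
The cylinder (disk) PSF can be sketched as a constant weight inside a circle of radius r and zero outside. In the continuous case the constant is $1/(\pi r^2)$, while the discrete version below simply normalises by the number of pixels inside the disk. The function name and grid choice are illustrative assumptions.

```python
import numpy as np

def defocus_kernel(radius):
    """Disk-shaped PSF: uniform weight where x^2 + y^2 <= radius^2, zero elsewhere."""
    r = int(np.ceil(radius))
    ax = np.arange(-r, r + 1)
    xx, yy = np.meshgrid(ax, ax)
    h = (xx ** 2 + yy ** 2 <= radius ** 2).astype(float)
    return h / h.sum()                        # normalise so the weights sum to 1
```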

1.4. Haze Blur

Haze blur is caused by the interference of natural fog. In this paper, haze blur is not simulated by any PSF, because plenty of real-life samples exist and are easy to collect for experiments.

2. Simplified-Fast-AlexNet (SFA): Network Architecture

There are 5 convolution layers and 1 fully connected layer.
The number of outputs of each AlexNet convolution layer is compressed proportionally by a ratio of 0.5. The reason is that four-class blur type classification is a relatively simple task compared with the 1,000 image categories in the 2012 ImageNet classification competition.
On the other hand, the first two FC layers are removed from the original AlexNet to improve speed and real-time performance, because more than 80 percent of the parameters are stored in the FC layers.
Batch normalization is used at layers 1, 2 and 5 instead of the original local response normalization.
Input: The size of the input images is 227×227×3.
1st layer: Conv_1: 48 kernels of size 11×11, stride of 4 pixels and pad of 0; MaxPool_1: kernel of size 3×3, stride of 2 pixels and pad of 0. The 48×27×27 feature maps are obtained as the output.
2nd layer: Conv_2: kernel of size 5×5, stride of 1 pixel and pad of 2 pixels; MaxPool_2: kernel of size 3×3, stride of 1 pixel and pad of 0.
3rd layer: Conv_3: kernel of size 5×5, stride of 1 pixel and pad of 2 pixels.
4th layer: Conv_4: kernel of size 3×3, stride of 2 pixels and pad of 0.
5th layer: Conv_5: kernel of size 3×3, stride of 1 pixel and pad of 1; MaxPool_5: kernel of size 3×3, stride of 2 pixels and pad of 0.
6th layer: Fully connected layer and ReLU.
Hence, the data flow of the different hidden layers of SFA is as follows: 227×227×3 > 27×27×48 > 13×13×128 > 13×13×192 > 13×13×192 > 6×6×128 > 1×1×4.
Caffe is used.
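
The data flow above can be checked with the usual output-size formula, floor((size + 2·pad − kernel) / stride) + 1, as in the sketch below. Two settings are assumptions rather than the values listed above: MaxPool_2 is given a stride of 2 and Conv_4 a stride of 1 with pad of 1 (as in the original AlexNet), because these are what reproduce the stated 13×13 feature maps. With the listed values, MaxPool_2 would give 25×25 and Conv_4 would give 6×6. The helper names are hypothetical, and the channel counts are AlexNet's 96, 256, 384, 384, 256 halved.

```python
def out_size(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

ALEXNET_CHANNELS = [96, 256, 384, 384, 256]
SFA_CHANNELS = [c // 2 for c in ALEXNET_CHANNELS]   # [48, 128, 192, 192, 128]

# (name, kernel, stride, pad)
LAYERS = [
    ("conv1", 11, 4, 0), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),   # assumption: stride 2, not the listed 1
    ("conv3", 5, 1, 2),
    ("conv4", 3, 1, 1),                       # assumption: stride 1, pad 1
    ("conv5", 3, 1, 1), ("pool5", 3, 2, 0),
]

def sfa_shapes(input_size=227):
    """Return (name, height, width, depth) after every conv and pooling layer."""
    size, depth = input_size, 3
    channels = iter(SFA_CHANNELS)
    shapes = []
    for name, kernel, stride, pad in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if name.startswith("conv"):
            depth = next(channels)            # pooling keeps the depth unchanged
        shapes.append((name, size, size, depth))
    return shapes

for name, h, w, d in sfa_shapes():
    print(f"{name}: {h}x{w}x{d}")
```

Under these assumptions the sketch gives 27×27×48 after pool1, 13×13×128 after pool2, 13×13×192 after conv3 and conv4, and 6×6×128 after pool5, matching the data flow stated above. The final fully connected layer then maps the 6×6×128 volume to the 4 blur classes.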

3. Datasets

3.1. Training Dataset

200,000 128×128×3 global blur patches are used for training.
In brief, these patches are cropped from images of the Oxford building dataset and the Caltech 101 dataset with synthetic Gaussian blur, motion blur, and defocus blur applied, as well as from real haze blur images gathered from websites.

3.2. Testing Dataset 1

The Berkeley dataset (200 images) and the Pascal VOC 2007 dataset are selected as the testing dataset.
In total, 22,240 global blur test patches are obtained, of which the 5,560 haze blur patches come from the same sources as the training samples.

3.3. Testing Dataset 2

A dataset consisting of 10,080 natural global blur image patches is constructed. The samples are all collected from the same websites as the haze blur samples in the training dataset.

4. Experimental Results

4.1. Loss Curves & Accuracy Curves

Loss Curves & Accuracy Curves of AlexNet and SFA

Though the details of the two models, AlexNet and SFA, differ, both the loss and the accuracy reach similar values, which indicates that the two models perform equivalently in terms of classification accuracy.

4.2. Comparison with AlexNet

P_N: The number of model parameters.
L_N: The model depth.
F_T: The forward propagation time for a single image.
B_T: The error backward propagation time for a single image.
CLF_T: The time to classify a single image.
Tr_T: The model training time.
Error: The classification error rate on testing dataset 1.

P_N of AlexNet is approximately 1000 times that of SFA.
CLF_T of SFA is 0.5 s shorter than that of AlexNet, which indicates that SFA is more suitable for practical applications.
The total training time of SFA is less than one day, whereas AlexNet requires about two days.
The classification error rate of SFA is only 0.0105 higher than that of AlexNet.

4.3. SOTA Comparison

The classification accuracies of the two-step way [4], single-layered NN [8] and DNN [9] are taken from the original articles. (This is strange since the datasets are different, but it is understandable that re-implementation may not be possible.)
Accuracy1 is measured on testing dataset 1 and Accuracy2 on testing dataset 2.
The prediction accuracy (>90%) of learned-feature-based methods is generally superior to that (<90%) of methods whose features are handcrafted.
The classification accuracy of SFA on the simulated testing dataset is 96.99%, which is slightly lower than AlexNet's 97.74%. Nevertheless, it is still better than the DNN model's 95.2%.
In addition, the best performance of SFA on the natural blur dataset is 93.75%, slightly lower than AlexNet's 94.10%. However, the speed and real-time performance of SFA are significantly better than those of AlexNet.

Reference

[2017 IST] [SFA]
Blur Image Classification based on Deep Learning

Blur Classification

[SFA]