Published in Nerd For Tech

Review — SFA: Simplified-Fast-AlexNet (Blur Classification)

Blur Image Classification Using Simplified AlexNet

Samples of Blurred Images

Outline

  1. Brief Overview of Image Blur Modelling
  2. Simplified-Fast-AlexNet (SFA): Network Architecture
  3. Datasets
  4. Experimental Results

1. Brief Overview of Image Blur Modelling

  • Image blur can be regarded as a degradation process that turns a high-quality image into a low-quality blurred image.
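
A common way to write this degradation, assuming the usual convolution-plus-noise model, is given below. The symbols are assumptions chosen for illustration and are not taken from the paper: f is the sharp image, h the blur kernel (point spread function, PSF), n additive noise, g the observed blurred image, and * denotes 2-D convolution.

```latex
g(x, y) = f(x, y) * h(x, y) + n(x, y)
```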

1.1. Gaussian Blur

  • In many practical applications, such as remote sensing and satellite imaging, a Gaussian kernel is used to model the blur caused by atmospheric turbulence.
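
A minimal sketch of such a kernel, assuming the usual isotropic 2-D Gaussian PSF, h(x, y) ∝ exp(−(x² + y²) / (2σ²)), normalized to sum to 1. The function name, the kernel size, and the value of σ are illustrative choices, not values from the paper.

```python
import math

def gaussian_psf(size, sigma):
    """Return a size x size Gaussian kernel (list of rows) that sums to 1."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    c = size // 2
    kernel = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma ** 2))
               for x in range(size)]
              for y in range(size)]
    total = sum(sum(row) for row in kernel)
    # Normalize so that blurring preserves the overall image brightness.
    return [[v / total for v in row] for row in kernel]

psf = gaussian_psf(5, 1.0)  # 5x5 kernel, sigma = 1 (illustrative values)
```

A larger σ spreads the weight further from the center and produces a stronger blur.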

1.2. Motion Blur

  • Motion blur is another type to consider. It is caused by relative linear motion between the target and the camera.
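
A linear motion PSF can be sketched as a line segment of uniform weight through the kernel center. This is a simplified illustration, assuming a blur length in pixels and a motion angle in degrees; both parameter names and the rasterization by dense sampling are assumptions, not the paper's procedure.

```python
import math

def motion_psf(length, angle_deg):
    """Return a length x length kernel with uniform weight along a centered line."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")
    c = length // 2
    theta = math.radians(angle_deg)
    kernel = [[0.0] * length for _ in range(length)]
    # Sample the segment densely and mark every pixel it passes through.
    steps = 4 * length
    for i in range(steps + 1):
        t = -c + (2.0 * c) * i / steps           # signed distance from the center
        x = int(round(c + t * math.cos(theta)))
        y = int(round(c - t * math.sin(theta)))  # image rows grow downward
        kernel[y][x] = 1.0
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

horizontal = motion_psf(5, 0.0)   # blur along the x axis
vertical = motion_psf(5, 90.0)    # blur along the y axis
```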

1.3. Defocus Blur

  • Defocus blur is the type most commonly seen in daily life, and it can be modeled by a cylinder function.
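
The cylinder function is constant inside a disk of radius R and zero outside it. A minimal discrete sketch, assuming an integer radius in pixels (the function name and the radius value are illustrative):

```python
def defocus_psf(radius):
    """Return a (2*radius+1) square kernel: uniform inside a disk, zero outside."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    size = 2 * radius + 1
    kernel = [[1.0 if (x - radius) ** 2 + (y - radius) ** 2 <= radius ** 2 else 0.0
               for x in range(size)]
              for y in range(size)]
    total = sum(sum(row) for row in kernel)
    # Normalize by the number of pixels inside the disk.
    return [[v / total for v in row] for row in kernel]

psf = defocus_psf(2)  # 5x5 kernel, disk radius of 2 pixels (illustrative)
```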

1.4. Haze Blur

  • Haze blur is caused by the interference of natural fog. In this paper, haze blur is not simulated with a PSF, because plenty of real-world samples exist and are easy to collect for experiments.

2. Simplified-Fast-AlexNet (SFA): Network Architecture

Simplified-Fast-AlexNet (SFA): Network Architecture
  • The number of output channels of each AlexNet convolution layer is compressed proportionally by a ratio of 0.5. The reason is that four-class blur type classification is a relatively simple task compared with the 1,000 image categories of the 2012 ImageNet classification competition.
  • In addition, the first two FC layers of the original AlexNet are removed to improve speed and real-time performance, because more than 80 percent of the parameters are stored in the FC layers.
  • Batch normalization is used at layers 1, 2 and 5 instead of the original local response normalization.
  • Input: The size of the input images is 227×227×3.
  • 1st layer: Conv_1: 48 kernels of size 11×11, stride of 4 pixels and pad of 0; MaxPool_1: kernel of size 3×3, stride of 2 pixels and pad of 0. The 48×27×27 feature maps are obtained as the output.
  • 2nd layer: Conv_2: kernel of size 5×5, stride of 1 pixel and pad of 2 pixels; MaxPool_2: kernel of size 3×3, stride of 2 pixels and pad of 0.
  • 3rd layer: Conv_3: kernel of size 3×3, stride of 1 pixel and pad of 1 pixel.
  • 4th layer: Conv_4: kernel of size 3×3, stride of 1 pixel and pad of 1 pixel.
  • 5th layer: Conv_5: kernel of size 3×3, stride of 1 pixel and pad of 1; MaxPool_5: kernel of size 3×3, stride of 2 pixels and pad of 0.
  • 6th layer: Fully connected layer and ReLU.
  • Hence, the data flow of the different hidden layers of SFA is as follows: 227×227×3 > 27×27×48 > 13×13×128 > 13×13×192 > 13×13×192 > 6×6×128 > 1×1×4.
  • Caffe is used as the deep learning framework.
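
The stated data flow can be checked with the standard output-size formula, out = floor((in + 2·pad − kernel) / stride) + 1. The sketch below is only a sanity check, not the authors' code. It assumes AlexNet-style settings for the layers that determine the 13×13 and 6×6 sizes (3×3 kernels with stride 1 and pad 1 for Conv_3 and Conv_4, and stride 2 for every 3×3 max-pooling layer), chosen so that the trace reproduces the data flow listed above.

```python
def out_size(n, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer (floor convention)."""
    return (n + 2 * pad - kernel) // stride + 1

# (name, kernel, stride, pad, output channels; None means pooling keeps channels)
LAYERS = [
    ("Conv_1",    11, 4, 0, 48),
    ("MaxPool_1",  3, 2, 0, None),
    ("Conv_2",     5, 1, 2, 128),
    ("MaxPool_2",  3, 2, 0, None),
    ("Conv_3",     3, 1, 1, 192),   # assumed AlexNet-style 3x3, pad 1
    ("Conv_4",     3, 1, 1, 192),   # assumed AlexNet-style 3x3, pad 1
    ("Conv_5",     3, 1, 1, 128),
    ("MaxPool_5",  3, 2, 0, None),
]

def trace(size=227, channels=3):
    """Return the (height, width, channels) shape after the input and each layer."""
    shapes = [(size, size, channels)]
    for _name, kernel, stride, pad, out_channels in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if out_channels is not None:
            channels = out_channels
        shapes.append((size, size, channels))
    return shapes

shapes = trace()
for (name, *_), shape in zip(LAYERS, shapes[1:]):
    print(f"{name:10s} -> {shape[0]}x{shape[1]}x{shape[2]}")
```

Every division in this trace is exact, so rounding up instead of down would give the same sizes. The final 6×6×128 feature map then feeds the single fully connected layer with 4 outputs.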

3. Datasets

3.1. Training Dataset

  • 200,000 128×128×3 global blur patches are used for training.
  • In brief, the Gaussian blur, motion blur, and defocus blur patches are cropped from synthetically blurred images of the Oxford building dataset and the Caltech 101 dataset, while the haze blur patches are cropped from real haze images gathered from online websites.
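
The synthetic part of this pipeline, blurring an image with a PSF and then cropping patches, can be sketched as follows. This is a simplified illustration on a small grayscale image stored as a list of rows. The function names, the box kernel, the edge handling, and the patch size and stride are all assumptions; the review does not describe the paper's exact kernel parameters, colour handling, or crop positions.

```python
def blur(image, psf):
    """Convolve a 2-D grayscale image with a square PSF, replicating edge pixels."""
    h, w = len(image), len(image[0])
    k = len(psf) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j, row in enumerate(psf):
                for i, weight in enumerate(row):
                    # Convolution flips the kernel; clamp indices at the borders.
                    yy = min(max(y + k - j, 0), h - 1)
                    xx = min(max(x + k - i, 0), w - 1)
                    acc += weight * image[yy][xx]
            out[y][x] = acc
    return out

def crop_patches(image, size, stride):
    """Return all size x size patches taken every `stride` pixels."""
    h, w = len(image), len(image[0])
    return [[row[x:x + size] for row in image[y:y + size]]
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]

# Toy example: an 8x8 checkerboard blurred with a 3x3 box kernel,
# then cut into four non-overlapping 4x4 patches.
box = [[1.0 / 9.0] * 3 for _ in range(3)]
image = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
blurred = blur(image, box)
patches = crop_patches(blurred, 4, 4)
```

In a real pipeline the box kernel would be replaced by a Gaussian, motion, or defocus PSF, and each patch would be stored with the label of the kernel type that produced it.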

3.2. Testing Dataset 1

  • 200 images from the Berkeley dataset and the Pascal VOC 2007 dataset are selected as the testing dataset.
  • In total, 22,240 global blur test patches are obtained, of which the 5,560 haze blur patches come from the same sources as the training samples.

3.3. Testing Dataset 2

  • A dataset consisting of 10,080 natural global blur image patches is constructed. The samples are all collected from the same websites as the haze blur samples in Training dataset.

4. Experimental Results

4.1. Loss Curves & Accuracy Curves

Loss Curves & Accuracy Curves of AlexNet and SFA

4.2. Comparison with AlexNet

Comparison with AlexNet
  • L_N: The model depth.
  • F_T: The forward propagation time of a single image.
  • B_T: The error backward propagation time of a single image.
  • CLF_T: The time to classify a single image.
  • Tr_T: The model training time.
  • Error: The classification error rate on Testing Dataset 1.
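
Quantities such as Error and CLF_T can be measured along the following lines. This is a minimal sketch, not the paper's benchmarking code: `toy_classifier` is a hypothetical stand-in for the trained network, and the function names are illustrative.

```python
import time

def error_rate(predictions, labels):
    """Fraction of samples whose predicted class differs from the true class."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)

def mean_time_per_image(classify, images):
    """Run `classify` on every image and return (predictions, seconds per image)."""
    start = time.perf_counter()
    predictions = [classify(image) for image in images]
    elapsed = time.perf_counter() - start
    return predictions, elapsed / len(images)

def toy_classifier(image):
    """Hypothetical stand-in: maps an 'image' to one of four blur classes."""
    return image[0] % 4

images = [[0], [1], [2], [3], [5]]
labels = [0, 1, 2, 3, 0]
predictions, seconds = mean_time_per_image(toy_classifier, images)
error = error_rate(predictions, labels)
```

Averaging over many images, as done here, smooths out timer noise that would dominate a single measurement.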

4.3. SOTA Comparison

  • Accuracy1 is measured on Testing Dataset 1 and Accuracy2 is measured on Testing Dataset 2.
  • The prediction accuracy of learned-feature methods (>90%) is generally higher than that of methods with handcrafted features (<90%).
  • On the simulated testing dataset, SFA reaches a classification accuracy of 96.99%, slightly below AlexNet (97.74%) but still better than the DNN model (95.2%).
  • On the natural blur dataset, the best accuracy of SFA is 93.75%, slightly below AlexNet's 94.10%. However, SFA is significantly faster than AlexNet and better suited to real-time use.
