Deblur Photos Using Generic Pix2Pix

A Naive Approach with Minimal Domain Knowledge

Motivation / Background

Last week my partner came across a problem at work. There were some poorly shot photos that were quite blurry and needed to be repaired. Unsharp masking didn’t work well, and neither did a few free photo-repair programs. The problem was eventually solved by manually recreating the important parts of the photos in Photoshop. But I couldn’t help wondering whether deblurring could be done with a generic deep learning algorithm.

I started with some super-resolution algorithms, but soon realized that the two problems differ. Deblurring, in essence, tries to reverse a convolution applied to an image (blind deconvolution, since the blur kernel is unknown). Super-resolution, on the other hand, tries to reverse the down-sampling of an image. I therefore decided that the pix2pix model, which learns paired mappings (here, from blurry photos to clear ones), should be a better fit for this task [1].

(Warning: Restoring degraded images is an actively researched topic about which I’m mostly ignorant. This post is about me throwing a generic algorithm at it. Please check the references section for some examples of the research [2][3][4].)

Implementation Overview

Explanation of Adversarial Training, from [1]

The code is based on the pix2pix implementation by mrzhu-cool on GitHub, with the following modifications:

  • The original GAN loss is replaced by the Wasserstein loss (with a structure similar to martinarjovsky/WassersteinGAN) [5].
  • The image is normalized as in torchvision.models.
  • The Tanh activation in the last layer of the generator is removed.
  • Before adversarial training, the generator is trained for 1 epoch with only an L1 loss against the original (sharp) image.
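To make the loss-related modifications above more concrete, here is a minimal sketch of the standard Wasserstein critic and generator objectives combined with an L1 content term. It is not the repository's code. The function names are hypothetical, and the default weights read the 1:5 weighting mentioned in the to-do list as adversarial:L1, which is an assumption. The functions only use `.mean()` and `abs()`, so they should work on NumPy arrays (used in the demo) as well as on PyTorch tensors.

```python
import numpy as np

def critic_loss(scores_real, scores_fake):
    """Wasserstein critic loss: minimizing it pushes real scores up, fake scores down.

    In the original WGAN the critic weights are also clipped after each update;
    that step is not shown here.
    """
    return scores_fake.mean() - scores_real.mean()

def l1_loss(output, target):
    """Mean absolute error between the generator output and the sharp image."""
    return abs(output - target).mean()

def generator_loss(scores_fake, output, target, adv_weight=1.0, l1_weight=5.0):
    """Weighted sum of the adversarial term and the L1 content term.

    The 1:5 defaults are an assumed reading of the weighting in the post.
    """
    adversarial = -scores_fake.mean()
    return adv_weight * adversarial + l1_weight * l1_loss(output, target)

# Toy numbers standing in for critic scores and images.
scores_real = np.array([1.0, 3.0])
scores_fake = np.array([0.0, 1.0])
output = np.array([[0.0, 1.0], [2.0, 3.0]])
target = np.array([[0.0, 0.0], [2.0, 5.0]])

pretrain = l1_loss(output, target)                          # L1-only warm-up objective
d_loss = critic_loss(scores_real, scores_fake)
g_loss = generator_loss(scores_fake, output, target)
print(pretrain, d_loss, g_loss)
```

During the one-epoch warm-up only `l1_loss` would be minimized. After that, the critic and generator losses would be minimized in alternating steps.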

The MIRFLICKR-25k dataset is used (in the hope that the model generalizes better to real-life photos). The first 20k photos are used for training, with scaling and random cropping applied. Of the last 5k photos, around 2k are used as the validation/development set, and the rest are reserved as the test set (not used yet).
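The split and the random cropping described above could look roughly like the sketch below. The exact validation size (taken as 2,000 here), the crop size, and the helper names are all assumptions for illustration. The real data loader may differ.

```python
import random

def split_indices(n_total=25000, n_train=20000, n_val=2000):
    """Split photo indices in order: first 20k train, next ~2k validation, rest test."""
    indices = list(range(n_total))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

def random_crop_box(width, height, crop_size, rng):
    """Pick a random square crop as (left, top, right, bottom).

    Applying the same box to the sharp image and to its blurred copy keeps
    the training pair aligned.
    """
    if crop_size > width or crop_size > height:
        raise ValueError("crop is larger than the image; scale the image up first")
    left = rng.randint(0, width - crop_size)   # randint is inclusive on both ends
    top = rng.randint(0, height - crop_size)
    return left, top, left + crop_size, top + crop_size

train_ids, val_ids, test_ids = split_indices()
rng = random.Random(0)
box = random_crop_box(width=500, height=375, crop_size=256, rng=rng)  # hypothetical sizes
print(len(train_ids), len(val_ids), len(test_ids), box)
```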

The artificial blur is created by applying one of two filters, a uniform 3x3 filter or a Gaussian 5x5 filter (there’s a lot of room for improvement):

Left: Uniform Filter used by Blur(3); Right: Gaussian Filter used by GaussianBlur(5, 1.5)
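For readers who want to reproduce similar blur, the two filters can be approximated in NumPy as follows. This is a sketch under assumptions: it reads GaussianBlur(5, 1.5) as a 5x5 kernel with sigma 1.5, and it does not claim to match the exact kernel values or edge handling of the library used in the post. The function names are made up for this example.

```python
import numpy as np

def uniform_kernel(size=3):
    """Box filter: every tap has the same weight and the weights sum to 1."""
    return np.full((size, size), 1.0 / (size * size))

def gaussian_kernel(size=5, sigma=1.5):
    """Sampled 2-D Gaussian, normalized so the weights sum to 1."""
    offsets = np.arange(size) - (size - 1) / 2.0
    profile = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(profile, profile)
    return kernel / kernel.sum()

def apply_filter(image, kernel):
    """Filter a 2-D (grayscale) image, replicating edge pixels so the size is kept."""
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    # Both kernels are symmetric, so correlation and convolution coincide here.
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

sharp = np.random.default_rng(0).random((32, 32))   # stand-in for a real photo
blurred_uniform = apply_filter(sharp, uniform_kernel(3))
blurred_gaussian = apply_filter(sharp, gaussian_kernel(5, 1.5))
print(blurred_uniform.shape, blurred_gaussian.shape)
```

For color photos the same filter would be applied to each channel separately.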

Results — Uniform Filter

Achieved ~19.5 dB PSNR on the development set after 200 epochs (batch size 16).
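As a reminder of what this metric measures, PSNR is the ratio between the squared peak pixel value and the mean squared error, on a logarithmic scale. Below is a minimal sketch, assuming pixel values scaled to [0, 1]. The post does not say which value range or color handling was used for the reported numbers, so the `max_value` default is an assumption.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)   # every pixel is off by 0.1, so MSE is 0.01
print(psnr(reference, estimate))  # about 20 dB
```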

Samples from training dataset. Left: Blurred (Input); Middle: Recovered (Output); Right: Ground Truth
Real Photo Examples. Left: Blurred (Input); Middle: Recovered (Output)

Results — Gaussian Filter

Achieved ~19.8 dB PSNR on the development set after 200 epochs (batch size 16).

Samples from training dataset. Left: Blurred (Input); Middle: Recovered (Output); Right: Ground Truth
Real Photo Examples. Left: Blurred (Input); Middle: Recovered (Output)

Remarks

  • The model performs reasonably well on the artificial dataset.
  • However, the model converges quite slowly considering how simple the blurring filters are.
  • On real photos, the model has limited success. Some parts of the resulting images look unnatural. This is somewhat expected given how the training dataset was prepared (the model is unlikely to generalize well).
  • We might need to provide some negative training examples, where the photo is already clear and needs little or no enhancement.

Source Code

The code is released on GitHub. It has been tested on the ceshine/cuda-pytorch:0.2.0 Docker image. Please check the accompanying Dockerfile for details.


To-Do’s

  • Use a more diverse set of blurring filters during training so the model generalizes better.
  • Smoothly blend deblurred patches, as in Vooban/Smoothly-Blend-Image-Patches, to handle larger (higher-resolution) photos.
  • Tune the weights of the adversarial and content (L1) losses. (Currently using a 1:5 weighting.)
  • Try the improved Wasserstein GAN [6] with the latest PyTorch on the master branch, which supports gradients of gradients.
  • Try a U-Net structure for the generator. (Currently an encoder-decoder structure.)
  • Compare the RMSprop optimizer suggested in [5] with Adam.
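To give an idea of what the patch-blending item could involve, here is a simplified sketch: process overlapping patches and average them with a smooth window, so that patch borders contribute little to the final image. This is not the Vooban/Smoothly-Blend-Image-Patches implementation. The Hann window, the patch and stride sizes, and the `process` callback (which stands in for the deblurring model) are assumptions for illustration.

```python
import numpy as np

def blend_patches(image, process, patch=64, stride=32):
    """Run `process` on overlapping patches of a 2-D image and blend the results."""
    h, w = image.shape
    if h < patch or w < patch or (h - patch) % stride or (w - patch) % stride:
        raise ValueError("image size must fit the patch grid; pad the image first")
    # Hann window without its zero endpoints, so every pixel gets a positive weight.
    taper = np.hanning(patch + 2)[1:-1]
    window = np.outer(taper, taper)
    accum = np.zeros((h, w), dtype=np.float64)
    weight = np.zeros((h, w), dtype=np.float64)
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            tile = process(image[top:top + patch, left:left + patch])
            accum[top:top + patch, left:left + patch] += window * tile
            weight[top:top + patch, left:left + patch] += window
    # Weighted average of all patches that cover each pixel.
    return accum / weight

photo = np.random.default_rng(0).random((128, 96))      # stand-in for a large photo
restored = blend_patches(photo, process=lambda tile: tile)  # identity "model"
print(np.allclose(restored, photo))
```

With an identity `process` the blended result reproduces the input, which is a quick sanity check before plugging in a real model.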

References:

  1. Isola, P., Zhu, J.-Y., Zhou, T., & Efros, A. A. (n.d.). Image-to-Image Translation with Conditional Adversarial Networks.
  2. Image restoration with Convolutional Neural Networks.
  3. Schuler, C. J., Hirsch, M., Harmeling, S., & Schölkopf, B. (n.d.). Learning to Deblur.
  4. Yan, Q., & Wang, W. (n.d.). DCGANs for image super-resolution, denoising and debluring.
  5. Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN.
  6. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. (2017). Improved Training of Wasserstein GANs.