Image Inpainting with Deep Learning

Tarun Bonu
Published in JamieAi
4 min read · Oct 8, 2018

Machines are capable of hallucinating. This capability can help us develop techniques such as image inpainting. For more information on how they do it, give this article a read.

What is Image Inpainting?

Inpainting refers to the art of restoring lost parts of an image and reconstructing them based on the background information; in other words, it is the process of filling in missing data in a designated region of the visual input. In the digital world, it refers to the application of sophisticated algorithms to replace lost or corrupted parts of image data.
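As a minimal sketch (the array sizes and the square region below are purely illustrative), the region to be filled is typically described by a binary mask over the image:

```python
import numpy as np

# Illustrative only: a grayscale image with a square region marked as missing.
image = np.random.rand(64, 64)           # stand-in for a real image
mask = np.zeros((64, 64), dtype=bool)    # True marks pixels to be inpainted
mask[20:40, 20:40] = True

# The corrupted input that an inpainting algorithm receives:
corrupted = image.copy()
corrupted[mask] = 0.0                    # the missing region, to be reconstructed
```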

Image inpainting has been widely investigated for applications such as digital effects, image restoration, and image coding and transmission.

Given the above image, how can we fill in the missing information? Imagine we are building a system to fill in the missing pieces. How can a system do it? How would the human brain do it? What kind of information do we use? These are the questions we should think about in order to solve this problem.

There are two types of information to focus on:

  • Contextual information (how the pixels surrounding the missing region relate to one another)
  • Perceptual information (whether the completed content looks plausible and natural as a whole)

Traditional Inpainting

Traditionally, image inpainting is addressed either with diffusion-based approaches, which propagate local structures into the unknown region, or with exemplar-based approaches, which construct the missing part one pixel (or patch) at a time while maintaining consistency with the neighbouring pixels.
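As a small illustration of the classical route, OpenCV ships two such algorithms; the file name and mask below are placeholders, not taken from the article:

```python
import cv2
import numpy as np

# Placeholder input: an 8-bit image and a mask whose non-zero pixels mark the hole.
img = cv2.imread("damaged.jpg")
mask = np.zeros(img.shape[:2], dtype=np.uint8)
mask[100:150, 100:150] = 255

# Navier-Stokes based inpainting: diffusion-style propagation of local structure
# from the boundary of the hole inwards.
restored_ns = cv2.inpaint(img, mask, 3, cv2.INPAINT_NS)

# Telea's fast-marching method, another classical alternative.
restored_fm = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)
```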

These approaches fail when the missing region is large; hence an additional component providing plausible imaginations (hallucinations by the machine) is needed. This additional information can be provided by high-order models of natural images, such as those computed by deep neural networks.

Implementation using Deep Neural Networks

In this approach, we rely on the hallucinations of pre-trained neural networks to fill large holes in images. These networks are pre-trained for supervised image classification: each image has a specific label, and the network learns to approximate the image-to-label mapping through a cascade of elementary operations. When trained on huge datasets (millions of images with thousands of labels), deep networks achieve remarkable classification performance that can occasionally surpass human accuracy. A discriminative pre-trained neural network is then used to guide the image reconstruction, with the last layer of the deep network entering the inpainting problem directly.
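As a minimal sketch of what "discriminative pre-trained network" means here (the choice of ResNet-18 is an assumption for illustration, not necessarily the architecture used in the referenced work):

```python
import torch
from torchvision import models

# Any ImageNet-pretrained classifier can play the role of the discriminative prior.
net = models.resnet18(pretrained=True)
net.eval()

x = torch.rand(1, 3, 224, 224)      # placeholder for a preprocessed input image
with torch.no_grad():
    logits = net(x)                 # last-layer responses, one score per class
score = logits.max()                # the response that guides the reconstruction
```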

Discriminative pre-trained neural network

Maximization Problem

Let us consider the following maximization problem:

$$\max_{\hat{I}} \; N_l(\hat{I}) \quad \text{subject to} \quad \hat{I}_{\Omega} = I_{\Omega}$$

N — the trained neural network; N_l(·) denotes the response of its layer l (here, the last layer)

I — the image with missing/deteriorated parts

Î — the unknown image to recover

Ω — the subset of R² containing the known part of the image

Our goal is to recover the pixels in the complement of Ω (denoted Ω^c). The above problem reconstructs the missing part Ω^c using the prior knowledge of the classifier, which has potentially seen millions of images during the training phase.
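A minimal PyTorch sketch of this constrained maximization follows, under the assumption that maximizing N_l means running gradient ascent on the classifier's strongest last-layer response while re-imposing the known pixels after each step; all names, sizes, and hyperparameters here are illustrative, not taken from the article:

```python
import torch
from torchvision import models

# Frozen, pretrained classifier acting as the hallucinating prior.
net = models.resnet18(pretrained=True).eval()
for p in net.parameters():
    p.requires_grad_(False)

corrupted = torch.rand(1, 3, 224, 224)     # observed image I, hole already zeroed out
known = torch.ones_like(corrupted)         # 1 on Omega (known pixels), 0 on the hole
known[:, :, 80:140, 80:140] = 0.0

I_hat = corrupted.clone().requires_grad_(True)
optimizer = torch.optim.Adam([I_hat], lr=0.05)

for _ in range(200):
    optimizer.zero_grad()
    response = net(I_hat).max()            # N_l(I_hat): strongest last-layer response
    (-response).backward()                 # minimize the negative => gradient ascent
    optimizer.step()
    with torch.no_grad():                  # enforce the constraint I_hat_Omega = I_Omega
        I_hat.data = known * corrupted + (1 - known) * I_hat.data
```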

Regularization strategy

The Total Variation (TV) norm is one such strategy; its goal is to remove undesirable details while preserving important ones such as edges. The TV norm has been used extensively as a regularizer in several inverse problems, such as denoising and super-resolution, because of its edge-preserving properties.
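For concreteness, a common (anisotropic) form of the TV norm can be written and combined with the maximization objective roughly as follows; the weight lambda_tv is a hypothetical hyperparameter, not a value given in the article:

```python
import torch

def tv_norm(img: torch.Tensor) -> torch.Tensor:
    """Anisotropic total-variation norm of a (B, C, H, W) image tensor."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().sum()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().sum()
    return dh + dw

# Illustrative combined objective (see the sketch above): maximize the network's
# response while keeping the reconstruction piecewise-smooth.
# loss = -net(I_hat).max() + lambda_tv * tv_norm(I_hat)
```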

Comparison of different techniques

  • An original image is masked deliberately to check performance.
  • Diffusion-based inpainting loses the edges.
  • [5] is an exemplar-based approach; it does not manage to reconstruct the corrupted images effectively.
  • The deep neural network correctly completes the shape in the image, and combining the deep network's hallucination with regularization results in effective image restoration.

Other results

[Result figures (1)–(4)]
