week5: m0d3l.cpkt #Learning Based Image Colorization

--

Theme: Learning Based Image Colorization

Team Members: Bugrahan Akbulut, Gökay Atay

Gray-scale and Colorized Image

REMINDER

Image colorization assigns colors to a gray-scale image, an important yet difficult image processing task encountered in various applications. Colorizing a gray-scale image without any user interaction is nearly impossible: it requires preliminary information about the image, such as background color, the objects in the image, texture information, edges, and lines. Over the past years, three different families of methods have been proposed for the image colorization problem: scribble-based colorization, example-based colorization, and learning-based colorization. One of the earliest approaches is scribble-based colorization, which we think requires the most human work: artists paint small colored lines (scribbles) on the gray-scale image, and these scribbles are then propagated to colorize the whole image. Example-based colorization is also widely used in different studies. The main idea of this approach is to transfer colors from a reference image: pixels in the gray-scale image receive the colors of reference pixels with similar characteristics. As we discussed in the previous weeks' posts, we are going to colorize images using the learning-based approach.

Original and Colorized Image by Deep Convolutional Neural Network

In most scene images the sky is blue and the grass is green; not always, but most of the time. Without user interaction we cannot gather this kind of preliminary information unless we use deep learning. Today, deep learning models work almost flawlessly for most visual tasks, and in some cases they even perform better than humans. By training our model on a subset of the ImageNet dataset, we aim to overcome these problems.

Recent works have shown that Convolutional Neural Networks (CNNs) are a proper solution for many visual tasks, including image colorization. In this post, we will explain our dataset, our model, and the training and testing stages.

ABOUT DATASET

Currently, there are 14,197,122 images in 21,841 synsets indexed on ImageNet. Since training the model on the complete ImageNet dataset would be painful in terms of time, we have decided to train and test our model on a small subset of dog images from ImageNet.

We have created a text file that contains the URLs of some dog images, and we have written a script that iterates through each URL and downloads the image to build our dataset.
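A minimal sketch of such a download script is shown below. The file and directory names are assumptions, and the actual script may differ.

```python
import os
import requests

URL_FILE = "dog_image_urls.txt"   # hypothetical name of the URL list
OUT_DIR = "dataset/dogs"          # hypothetical output directory

os.makedirs(OUT_DIR, exist_ok=True)

with open(URL_FILE) as f:
    urls = [line.strip() for line in f if line.strip()]

for i, url in enumerate(urls):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        continue  # skip dead or unreachable URLs
    with open(os.path.join(OUT_DIR, f"dog_{i:05d}.jpg"), "wb") as out:
        out.write(response.content)
```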

Example Images from the Dataset

MODEL

As we mentioned in the previous week's post, we have decided to build our model on the DenseNet architecture. Our model consists of two parts, similar to Automatic Image Colorization [2].

We have created the first part of our model using DenseNet121 [1]. We directly copied DenseNet121's first 52 layers into our model and used DenseNet's pre-trained weights. Using a pre-trained model and the lower layers of DenseNet gives us an advantage: we do not have to reinvent the wheel. Since the bottom layers of deep models encode low-level information such as edges, lines, and textures, copying these layers and their weights reduces the required training time.
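A minimal Keras sketch of this encoder is shown below. Where exactly the 52-layer cut falls in Keras's layer indexing, and whether the copied layers are frozen, are assumptions.

```python
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.models import Model

# Load DenseNet121 with pre-trained ImageNet weights. It expects
# 3-channel input, so the gray-scale L channel is replicated across
# three channels before being fed in.
base = DenseNet121(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))

# Keep the first 52 layers as the encoder (index 51 is the 52nd layer;
# the exact cut point is an assumption).
encoder = Model(inputs=base.input, outputs=base.layers[51].output)
encoder.trainable = False  # optionally freeze the pre-trained weights
```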

DenseNet Architecture from Original DenseNet Paper

In the second part, we apply deconvolution operations (convolution blocks followed by upsampling to increase the spatial resolution of the features). The final output shape is (224, 224, 2) for each image, containing the a and b color channels.
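Continuing from the encoder sketch above, a minimal decoder could look like the following. The filter counts are assumptions, and the number of upsampling steps must match the encoder's output resolution (the three doublings here would take 28x28 features to 224x224).

```python
from tensorflow.keras.layers import Conv2D, UpSampling2D
from tensorflow.keras.models import Model

# Decoder sketch: alternating convolution blocks and 2x upsampling.
x = encoder.output
for filters in (128, 64, 32):
    x = Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    x = UpSampling2D((2, 2))(x)

# Two output channels: the a and b channels of CIE Lab. The tanh
# activation (scaling outputs to [-1, 1]) is an assumption about
# how the targets are normalized.
ab = Conv2D(2, (3, 3), padding="same", activation="tanh")(x)
model = Model(inputs=encoder.input, outputs=ab)
```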

TRAINING THE MODEL

We have approached this as a regression problem: the model predicts output color values from the gray-scale version of an image. Put simply, we map lightness values to ab values. We have chosen the CIE Lab color space for training. CIE Lab is approximately perceptually uniform, meaning that the perceived difference between any pair of colors is roughly proportional to their Euclidean distance in the space. Using CIE Lab instead of RGB also reduces the number of values to predict: our model maps the gray channel to the a and b channels instead of the r, g, and b channels. We have decided to use mean squared error for this regression problem, just like Automatic Image Colorization [2]. We have used the Adam optimizer with a batch size of 32 and an epoch count of 15. We kept the epoch count small to quickly check and visualize whether our model was really learning. For the final result, we will train again with a larger epoch count (next week).
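The sketch below shows how this setup could look in code, continuing from the model above. The normalization constants and the use of scikit-image for the Lab conversion are assumptions.

```python
import numpy as np
from skimage.color import rgb2lab
from tensorflow.keras.optimizers import Adam

# Hypothetical preprocessing: `images` is an (N, 224, 224, 3) array
# of RGB images with float values in [0, 1].
lab = rgb2lab(images)               # L in [0, 100], a/b roughly in [-128, 127]
L = lab[..., 0:1] / 100.0           # normalized lightness input
X_gray = np.repeat(L, 3, axis=-1)   # replicate L to 3 channels for DenseNet
Y_ab = lab[..., 1:] / 128.0         # normalized ab targets in roughly [-1, 1]

# Training setup from the post: MSE loss, Adam optimizer,
# batch size 32, 15 epochs.
model.compile(optimizer=Adam(), loss="mse")
history = model.fit(X_gray, Y_ab, batch_size=32, epochs=15)
```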

Here is the loss graph.

Loss Plot

As we can see from the graph, the loss decreases over time, so we can say that our model is learning.

TESTING THE MODEL

After training the model, we have tested it on some images that are not in the training dataset to see whether the results were plausible.
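A minimal inference sketch, using the same hypothetical normalization as above: predict the ab channels for a test image, reassemble the full Lab image, and convert it back to RGB for display.

```python
import numpy as np
from skimage.color import lab2rgb

# X_test_gray: (N, 224, 224, 3) array of replicated, normalized L channels.
pred_ab = model.predict(X_test_gray)        # (N, 224, 224, 2)

L = X_test_gray[..., 0:1] * 100.0           # undo lightness normalization
ab = pred_ab * 128.0                        # undo ab normalization
lab = np.concatenate([L, ab], axis=-1)

# Convert each Lab image back to RGB (float values in [0, 1]).
rgb = np.stack([lab2rgb(img) for img in lab])
```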

Some colorized test samples are shown below.

Colorized Test Samples (After 15 Epoch Training)
