Deep Learning Class Project Journal — Day 2

Patrick Mesana
4 min read · Apr 30, 2017


Last time, I showed my first results fixing corrupted images using undercomplete autoencoders.

I also tried overcomplete autoencoders, but I could not add many units to my layers without an unacceptable performance cost. My GPU memory filled up quickly, and the configurations that did fit took forever to run. The results were awful as well.

Instead of wasting more time on fully connected networks, I decided to take a chance on convolutional networks.

Convolutional autoencoders

Not surprisingly, pixels that are nearby in space are more correlated than those farther apart. Knowing this, you would expect a neural network to be more accurate on pixels closer to the missing middle’s borders. Fully connected networks are not that smart: they do not take this spatial structure into account.

A convolutional network adds this key aspect with one simple trick: shared parameters across space.
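To see why this matters, here is a back-of-the-envelope comparison in Python (a sketch; the 64×64×3 image size and 32 filters are assumptions, not my exact configuration):

    # One fully connected layer mapping a 64x64x3 image to an image of the
    # same size, versus one 3x3 convolution with 32 filters.
    n_pixels = 64 * 64 * 3                  # 12,288 input values
    dense_params = n_pixels * n_pixels      # ~151 million weights (plus biases)
    conv_params = 3 * 3 * 3 * 32 + 32       # 896 weights, independent of image size
    print(dense_params, conv_params)

The convolution reuses the same 896 weights at every spatial location, which is exactly the shared-parameters trick.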

Notions of undercomplete and overcomplete are not as useful in the convolutional context, but we still want to keep the idea of compacting the useful information; this is done with downsampling.

[Image credit: http://www.inference.vc]

My first model

There are many ways to downsample. I chose convolution + max-pooling instead of strided convolutions because striding is more aggressive: you may lose a lot of information when downsampling, and for this task, capturing good features from the border is important.
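The two options look like this in Keras (a minimal sketch; the filter count is a placeholder):

    # Two ways to halve the spatial resolution of a feature map.
    from keras.layers import Conv2D, MaxPooling2D

    def downsample_with_pooling(x):
        # Convolve at full resolution, then keep the strongest response
        # in each 2x2 window.
        x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
        return MaxPooling2D((2, 2))(x)

    def downsample_with_striding(x):
        # Convolve and downsample in one step: the filter skips every
        # other position, so three quarters of the responses are never
        # computed.
        return Conv2D(32, (3, 3), strides=(2, 2), activation='relu',
                      padding='same')(x)

My first model, listed below, uses the pooling variant; a complete sketch follows the list.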

  • Layers: 7 Conv, 2 Maxpooling, 2 Upsampling
  • Activations: ReLU
  • Optimization: Adam
  • Loss: binary crossentropy
  • Input: full image without the middle
  • Target: full image
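Putting that together, here is a minimal Keras sketch of a model like this one (filter counts and the 64×64 image size are assumptions; the exact code is on GitHub):

    from keras.models import Model
    from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D

    inp = Input(shape=(64, 64, 3))  # full image with the middle zeroed out

    # Encoder: 4 conv layers with 2 max-pooling downsamplings
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(inp)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2))(x)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2))(x)

    # Decoder: 3 conv layers with 2 upsamplings back to full resolution
    x = Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    # 7th conv reconstructs RGB; sigmoid assumes pixels scaled to [0, 1],
    # which is what binary crossentropy expects
    out = Conv2D(3, (3, 3), activation='sigmoid', padding='same')(x)

    model = Model(inp, out)
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    model.summary()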

Since last time, I have also added a couple of useful features to my code so I can understand what is going on. Below, you can see a summary of the model (just described) that I trained for 100 epochs (9h58):

[Figure: Loss/Acc for ConvNet v1]

Training loss converged slowly, and I was surprised to see validation loss following the same curve. I decided to double the number of convolution layers to add capacity to the network, to see if I could make it overfit the training set. The loss curves were similar, slightly better, and so were the results.

[Figure: Predicted images, ConvNet v2]

The images are blurry, but there is a clear difference from the results obtained with fully connected networks. Keep in mind this is just a sample; these are not the best or the worst predictions.

I then tried adding techniques to improve these results. The first one was batch normalization (BN); an advantage of BN is faster convergence without overfitting. I added BN for each conv layer. The loss did converge faster in terms of iterations, but each epoch took significantly longer to run. I am not sure why; maybe because of the Theano backend. Another problem was the quality of the images: they were all darker.
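For reference, the change looks roughly like this (a sketch; placing BN between the convolution and the ReLU is one common choice, and an assumption here rather than a detail of my original code):

    from keras.layers import Conv2D, BatchNormalization, Activation

    def conv_bn_relu(x, filters):
        x = Conv2D(filters, (3, 3), padding='same')(x)  # convolution, no activation yet
        x = BatchNormalization()(x)                     # normalize activations per batch
        return Activation('relu')(x)                    # nonlinearity after normalization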

[Figure: ConvNet v1 with BatchNorm results]

I also tried dropout for fun, even though the network was not overfitting. The result was the opposite of what I expected: validation loss increased, and the images were darker as well. I think this is mainly because dropout randomly removes strong links (from the border), weakening the network instead of making it stronger.
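The dropout variant looked roughly like this (a sketch; the 0.25 rate is a placeholder, since I varied it across runs):

    from keras.layers import Conv2D, Dropout

    def conv_dropout(x, filters, rate=0.25):
        x = Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
        # Randomly zero a fraction of the feature activations during
        # training; that includes activations carrying border information,
        # which may explain the degraded results.
        return Dropout(rate)(x)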

Hours and hours of computations…

I tried many other variants: changing the dropout rate, the learning rate, leaky ReLU, different numbers of filters… Hours and hours of computations, but nothing significantly better than the results already shown. There is probably a lot more I could have done with deep convolutional autoencoders, but I feel I need more reading and experience in DL at this point.

You can track all these changes on GitHub.

Conclusion

Convolutional networks did not disappoint in their capacity to generate more “believable” images, but the results are still not satisfying. In the beginning, I assumed I could treat these corrupted inputs as images with a lot of noise, which justified the use of autoencoders. But the missing part is too big; it seems all the networks I have been working on lack “imagination”.

Stay tuned for more.
