How to deal with image resizing in Deep Learning

Adriano Dennanni
Nov 28, 2018 · 4 min read

This post continues the one published by Infosimples on October 19, 2018: https://medium.com/infosimples/does-cnn-learns-modified-inputs-bc16ae1be498

TL;DR: The best way to deal with differently sized images is to downscale them to match the dimensions of the smallest image available.

If you read our last post, you know that CNNs are able to learn information from images even if their channels are flipped, at a cost in model accuracy.

This post studies a similar problem: suppose each color channel has a different size. What are the best ways to train an image classifier in those circumstances?

First, let's create a simple model to serve as a baseline for the comparisons made in this article:

Layer                        Output Shape              Param #
=================================================================
InputLayer                   (None, 100, 100, 3)       0
_________________________________________________________________
Conv2D                       (None, 100, 100, 32)      896
_________________________________________________________________
MaxPooling2D                 (None, 50, 50, 32)        0
_________________________________________________________________
Dropout                      (None, 50, 50, 32)        0
_________________________________________________________________
Conv2D                       (None, 50, 50, 64)        18496
_________________________________________________________________
MaxPooling2D                 (None, 25, 25, 64)        0
_________________________________________________________________
Dropout                      (None, 25, 25, 64)        0
_________________________________________________________________
Flatten                      (None, 40000)             0
_________________________________________________________________
Dense                        (None, 128)               5120128
_________________________________________________________________
Dropout                      (None, 128)               0
_________________________________________________________________
Dense                        (None, 2)                 258
=================================================================
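
As a quick sanity check on the summary above, the parameter counts can be reproduced by hand. The sketch below assumes 3x3 convolution kernels with "same" padding and 2x2 pooling; these values are not stated in the post and are only inferred from the shapes and counts in the table. The helper names are made up for this illustration.

```python
# Assumption: 3x3 kernels, "same" padding, 2x2 max pooling.
# These are inferred from the summary table, not stated in the post.

def conv2d_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights per filter, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per (input, unit) pair, plus one bias per unit."""
    return inputs * units + units

# Two 2x2 poolings shrink 100x100 to 25x25; the second conv has 64 filters.
flatten_size = 25 * 25 * 64

counts = {
    "conv1": conv2d_params(3, 3, 32),
    "conv2": conv2d_params(3, 32, 64),
    "dense1": dense_params(flatten_size, 128),
    "dense2": dense_params(128, 2),
}
total = sum(counts.values())

for name, value in counts.items():
    print(name, value)
print("total", total)
```

Almost all of the parameters sit in the first dense layer, because it is connected to every one of the 40000 flattened values. This is one reason the input resolution matters so much for a model like this one.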

It's a simple model with only two convolutional layers, able to tell dog pictures apart from non-dog pictures. After training it for 10 epochs (using complete 3-channel images, 100x100 pixels), it reached a maximum validation accuracy of 77.58%.

This value will be used as the reference for the next experiments in this post.

Scaling techniques

Figure: original picture (160x160), followed by the same picture rescaled with nearest-neighbor, bilinear, bicubic, Fourier-based, and edge-directed interpolation.

Each of the rescaled images was produced by downscaling the original to 40x40 and then upscaling it back to 160x160 with one of the algorithms listed above. Although a lot of the visual quality is lost, we can still tell that this is a picture of a shell, even with only 1/16 of the original information.

What about neural networks? Which upscaling algorithm works best for them? Or is it better to downscale the pictures instead? Let's settle the question.
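
The round trip described above can be sketched with NumPy for the two simplest algorithms. This is only an illustration under stated assumptions: the functions `resize_nearest` and `resize_bilinear` are written for this example, the input is random data standing in for one image channel, and nearest-neighbor is used for the downscaling step because the post does not say which method was used there.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of a 2-D array."""
    in_h, in_w = img.shape
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return img[rows[:, None], cols[None, :]]

def resize_bilinear(img, out_h, out_w):
    """Bilinear resize of a 2-D array, aligning pixel centers."""
    in_h, in_w = img.shape
    # Map each output pixel center back to input coordinates.
    y = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    x = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (y - y0)[:, None]   # vertical blend weights
    wx = (x - x0)[None, :]   # horizontal blend weights
    top = img[y0[:, None], x0[None, :]] * (1 - wx) + img[y0[:, None], x1[None, :]] * wx
    bottom = img[y1[:, None], x0[None, :]] * (1 - wx) + img[y1[:, None], x1[None, :]] * wx
    return top * (1 - wy) + bottom * wy

rng = np.random.default_rng(0)
original = rng.random((160, 160))            # stand-in for one image channel
small = resize_nearest(original, 40, 40)     # downscale (method is an assumption)
restored_nn = resize_nearest(small, 160, 160)
restored_bl = resize_bilinear(small, 160, 160)

kept_fraction = small.size / original.size   # 1600 / 25600
print(restored_nn.shape, restored_bl.shape, kept_fraction)
```

Neither upscaled array recovers the discarded pixels: nearest-neighbor repeats each remaining pixel in a 4x4 block, while bilinear blends neighboring pixels. Both only redistribute the 1/16 of the data that survived the downscale.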

Below, we have channel slices and combinations of them using different upscaling algorithms:

We can also test the following architecture, able to reduce bigger channels during training with convolutions:

Let's call this architecture “Multiresolution CNN”

The above architecture was developed with the idea that convolutions are able to reduce the channel dimensions while extracting only the most important features. You can check it out here:
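
What follows is not the author's implementation. It is only a minimal, hypothetical NumPy sketch of the general idea that a strided convolution can shrink a larger channel until it matches a smaller one. The channel sizes (100, 200 and 400 pixels), the 2x2 kernels, the stride of 2 and the function `conv2d_strided` are all assumptions made for illustration. In a real network the kernel weights would be learned during training rather than drawn at random.

```python
import numpy as np

def conv2d_strided(channel, kernel, stride):
    """Valid 2-D cross-correlation of one channel with one square kernel."""
    k = kernel.shape[0]
    out_h = (channel.shape[0] - k) // stride + 1
    out_w = (channel.shape[1] - k) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(channel[r:r + k, c:c + k] * kernel)
    return out

rng = np.random.default_rng(0)

# Hypothetical input: three channels with different resolutions.
red = rng.random((100, 100))
green = rng.random((200, 200))
blue = rng.random((400, 400))

# Random kernels stand in for weights that training would learn.
k_green = rng.standard_normal((2, 2))
k_blue_1 = rng.standard_normal((2, 2))
k_blue_2 = rng.standard_normal((2, 2))

# One stride-2 convolution halves 200 -> 100; two of them take 400 -> 100.
green_reduced = conv2d_strided(green, k_green, 2)
blue_reduced = conv2d_strided(conv2d_strided(blue, k_blue_1, 2), k_blue_2, 2)

# Once all channels share a size, they can be stacked and fed to a regular CNN.
merged = np.stack([red, green_reduced, blue_reduced], axis=-1)
print(merged.shape)
```

The difference from plain downscaling is that the reduction weights are part of the model, so in principle the network can learn which details of the larger channels to keep.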

After training the simple neural network presented at the beginning of this post with several upscaling techniques, we got the following accuracy rates:

Post-training results

If we take into consideration only the accuracy on the validation dataset, we can conclude that every upscaling technique we tried is inferior to downscaling the images to the size of the smallest one. The best thing to do in this case is simply to downscale the pictures to match the dimensions of the smallest channel.
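
The recommended approach can be sketched as follows. This is a minimal illustration, not the code used for the experiments: the channel sizes are hypothetical, the functions `downscale_area` and `match_smallest` are written for this example, and block averaging is an assumed choice of downscaling method, since the post does not name the one it used.

```python
import numpy as np

def downscale_area(channel, out_h, out_w):
    """Downscale a 2-D array by averaging equal-sized blocks of pixels."""
    in_h, in_w = channel.shape
    if in_h % out_h or in_w % out_w:
        raise ValueError("input size must be a multiple of the output size")
    fh, fw = in_h // out_h, in_w // out_w
    # Split rows and columns into (block index, offset inside block),
    # then average over the two offset axes.
    return channel.reshape(out_h, fh, out_w, fw).mean(axis=(1, 3))

def match_smallest(channels):
    """Downscale every channel to the smallest channel's size and stack them."""
    out_h = min(c.shape[0] for c in channels)
    out_w = min(c.shape[1] for c in channels)
    return np.stack([downscale_area(c, out_h, out_w) for c in channels], axis=-1)

rng = np.random.default_rng(0)

# Hypothetical input: one 100x100 channel and two larger ones.
red = rng.random((100, 100))
green = rng.random((200, 200))
blue = rng.random((400, 400))

image = match_smallest([red, green, blue])
print(image.shape)
```

The smallest channel passes through unchanged, and the result is a regular 3-channel array that can be fed to the baseline model without any change to its architecture.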

Neuronio

Neuronio is a Brazilian company that creates Deep Learning solutions and offers consulting services. At Medium, we write about machine learning and deep learning. #ai #deeplearning #machinelearning
