AWS Project 1 — CNN

May 11, 2018

In this project I attempt to build a Convolutional Neural Network (CNN) to classify images from the CIFAR10 dataset that ships with Keras. The dataset contains 50,000 32x32 color images for training and 10,000 32x32 color images for testing, labeled across 10 categories (hence the 10 in the dataset name). I end with a glimpse of what lies ahead in Project 3 by visualizing how this model learns. Stay tuned!

Model building

‘Conv2D’ takes the number of filters and the size of the convolution filter (kernel) we want. The input shape given to the first layer should correspond to the size of the input images.
The ‘padding’ term in the layer allows the convolution filter to include every single row and column by creating imaginary rows and columns that extend beyond the image.
I used the ‘relu’ activation because it performed better than ‘tanh’ and ‘softmax’ when I tried them. I also normalized the activations of the previous layer.
‘MaxPooling’ is a pooling layer that in essence downsamples our outputs. It takes the maximum of every subregion of the outputs from the previous layer.
‘Dropout’ is a regularization technique that randomly drops a fraction of a layer's units during training so as to not overfit the model to the training data. This is very useful in neural networks, which can easily reach close to 100% training accuracy while giving poorer validation and test accuracies.
‘Flatten’ is responsible for converting the output of the convolutional layers into a 1D feature vector that can then be passed into a ‘Dense’ layer. The ‘Dense’ layer is then responsible for the final classification task.
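To make the padding, pooling, and flattening steps above concrete, here is a minimal pure-Python sketch. It is not the Keras code used in this project: the helper names (`conv_output_size`, `max_pool_2x2`, `flatten`), the 3x3 kernel, and the toy 4x4 feature map are all assumptions made purely for illustration.

```python
def conv_output_size(size, kernel, padding):
    """Output size along one dimension for a stride-1 convolution."""
    if padding == "same":
        return size               # imaginary border rows/columns keep the size
    if padding == "valid":
        return size - kernel + 1  # no padding: the filter must fit inside
    raise ValueError(f"unknown padding: {padding!r}")


def max_pool_2x2(grid):
    """Downsample a 2D list by keeping the maximum of each 2x2 block."""
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            max(
                grid[2 * r][2 * c], grid[2 * r][2 * c + 1],
                grid[2 * r + 1][2 * c], grid[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def flatten(grid):
    """Turn a 2D list into a 1D feature vector, row by row."""
    return [value for row in grid for value in row]


# A 32-pixel side with a hypothetical 3x3 kernel:
same_size = conv_output_size(32, 3, "same")    # 32
valid_size = conv_output_size(32, 3, "valid")  # 30

# Toy 4x4 feature map (made-up numbers).
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]
pooled = max_pool_2x2(feature_map)  # [[4, 2], [2, 7]]
flat = flatten(pooled)              # [4, 2, 2, 7]
print(same_size, valid_size, pooled, flat)
```

Each 2x2 block of the feature map collapses to its largest value, so the 4x4 map becomes 2x2, and flattening then yields the 4-element vector that a ‘Dense’ layer could consume.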

Model compilation and fit

The compile method is used to configure the learning process for the CNN. It usually takes the ‘optimizer’, ‘loss’, and ‘metrics’ parameters.

I used ‘rmsprop’ for my optimizer as it performed better than ‘adam’ and ‘sgd’ on this task.
Since my task is a multi-class classification, I wanted a loss function that measures how far the predicted class probabilities are from the true labels, which is what ‘categorical_crossentropy’ does.
I used ‘accuracy’ as my metric as I wanted to focus on the accuracy of my model.
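As a rough illustration of what ‘categorical_crossentropy’ measures, the sketch below computes it by hand for a single sample. This is a simplified stand-in rather than Keras' own implementation; the three-class example, the probabilities, and the `eps` clipping value are assumptions chosen for readability.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Clip predictions so that log(0) can never occur.
    clipped = [min(max(p, eps), 1 - eps) for p in y_pred]
    return -sum(t * math.log(p) for t, p in zip(y_true, clipped))


y_true = [0, 0, 1]                 # the true class is the third one
confident = [0.05, 0.05, 0.90]     # most probability on the right class
mistaken = [0.60, 0.30, 0.10]      # most probability on a wrong class

loss_confident = categorical_crossentropy(y_true, confident)
loss_mistaken = categorical_crossentropy(y_true, mistaken)
print(round(loss_confident, 3), round(loss_mistaken, 3))
```

The loss is small when the model puts most of its probability on the correct class and grows quickly as that probability shrinks, which is the quantity the optimizer tries to push down.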

The fit method takes the training images, their labels, the number of epochs, and the batch size, among other parameters.

‘Epochs’ is the number of full passes over the training data, which I have set to 5. This is not many for a CNN but is sufficient for our model.
The ‘Batch size’ specifies the number of samples per gradient update.
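To see how epochs and batch size interact, here is a small back-of-the-envelope calculation. The batch size of 32 is an assumption (the post does not state the value that was used); the 50,000 training images and 5 epochs come from the text above.

```python
import math

train_samples = 50_000  # CIFAR10 training set size
epochs = 5              # as used in this project
batch_size = 32         # assumption: not stated in the post

# One weight update per batch; the last batch of an epoch may be smaller.
updates_per_epoch = math.ceil(train_samples / batch_size)
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, total_updates)
```

Under this assumption the weights would be updated 1,563 times per epoch, so a larger batch size means fewer but less noisy updates for the same number of epochs.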

Model accuracy

As can be seen, there is a steady increase in the accuracy of the network and a steady decrease in the loss. There are 5 points in the graphs because we ran 5 epochs.

The test accuracy of the model is 78.08%, which is not that good for a CNN. However, guessing at random would give an expected accuracy of 10% (since there are 10 classes), so this model performs well above chance.
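The comparison with random guessing can be spelled out in a few lines. The `accuracy` helper and the tiny label lists are hypothetical and only show how the metric is computed; the 78.08% figure, the 10,000 test images, and the 10 classes are taken from the post.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy example: 3 of 4 made-up predictions are right.
toy_accuracy = accuracy([3, 5, 1, 1], [3, 5, 1, 0])  # 0.75

num_classes = 10
test_images = 10_000
reported_accuracy = 0.7808

correct_images = round(reported_accuracy * test_images)  # 7808
random_baseline = 1 / num_classes                        # 0.1
improvement = reported_accuracy / random_baseline
print(toy_accuracy, correct_images, random_baseline, improvement)
```

In other words, the model labels roughly 7,808 of the 10,000 test images correctly, close to eight times what uniform random guessing would be expected to achieve.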

Model visualization

I wanted to see how this model is learning. What is each layer in this convoluted CNN (no pun intended) doing? In order to see this, I decided to visualize each layer on an image of a dog that will be used in Project 3. The only thing I changed in the model was the input shape, which I increased to (900x900x3). The model could now take dog images of size (900x900) as compared to the CIFAR10 images of size (32x32).

This was the image I provided it.

Disclaimer: if you have not seen how a CNN learns before, be warned that it is quite trippy.

The visualization of how each layer learns this image is really interesting and tells us a lot about which layers might be important for what aspect of the picture.

The first layer seems to be responsible for detecting the edges of the dog and the background. The learning becomes more abstract as we go deeper into the network, where the layers respond to more specific features.
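For intuition about why an early convolution layer can end up looking like an edge detector, here is a tiny sketch that slides a hand-written Sobel kernel over a toy image. This is an assumption-laden illustration: the filters in this model are learned during training, not hand-written, and the `convolve_valid` helper and the 4x6 image are made up for the example.

```python
# Hand-written horizontal-gradient (Sobel) kernel, used only for illustration.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def convolve_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1.

    Like CNN layers, this does not flip the kernel (cross-correlation).
    """
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Toy image: dark on the left, bright on the right, with a vertical edge.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

response = convolve_valid(image, SOBEL_X)
print(response)  # [[0, 36, 36, 0], [0, 36, 36, 0]]
```

The response is zero over the flat regions and large only where the window straddles the boundary between dark and bright pixels, which is the kind of pattern the first-layer visualizations show.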

This CNN does not have many layers, so we do not see highly specific detections such as a dog eye or a dog ear. Had we added more layers, we would have obtained highly pixelated visualizations representing these sparse, specific detections.

To see all the code, please visit my GitHub.

Click here for the next project on RNN, and here to see what I did with all the dog images I had.

Written by Samarth Goenka