First Dip into Convolutional Neural Networks

David Angeles
CUNY CSI MTH513
May 16, 2019

I didn’t know much about machine learning. Most of what I knew came from YouTube videos of people creating programs to be the best at a video game. Learning about the uses of machine learning and the large amount of mathematics that goes into it was somewhat overwhelming. But I had a relatively new ally who was more than willing to delve into the world of machine learning with me.

Competition: Fashion MNIST
Recognize clothing items from their pictures

This is our third competition of the semester. Fashion-MNIST is a data set of 28x28 gray-scale images (784 pixels per image), each with an associated label from one of 10 categories. It comes with a training set of 60,000 examples and a test set of 10,000 examples. Our task was to identify which category each article of clothing belonged to.
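For anyone who wants to follow along, Keras ships its own copy of Fashion-MNIST, so the data can be loaded in a couple of lines (we worked from the competition's files, so this is just a convenient stand-in):

```python
from tensorflow.keras.datasets import fashion_mnist

# Keras bundles a copy of Fashion-MNIST identical to the competition data.
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
print(x_train.shape)  # (60000, 28, 28) -- the 60,000 training images
print(x_test.shape)   # (10000, 28, 28) -- the 10,000 test images
```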

Model selection:

The competition is a multi-class classification problem. First, we chose SVM, since it was the newest model we had learned about. We went about making a simple baseline model with the intent to use grid search to improve it as we progressed. Due to its lack of efficiency and our own lack of understanding, SVM was not optimal for this competition.

Other models didn't seem like they would be a good choice either, mostly due to the data we were dealing with. This was the first competition where our data were images, and I was perplexed about how to handle them. Then we were told of a new model: neural networks.

CNN: (Convolutional Neural Networks)

A neural network is modeled loosely after the human brain and is designed to recognize patterns. Its nodes are each specialized to do a single task and are connected to other nodes, creating a network. A node takes in inputs, applies a function to them, and usually sends its output on to other nodes. A convolutional neural network (CNN, or ConvNet) is a class of neural networks most commonly applied to analyzing visual imagery.
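As a toy illustration (not part of our competition code), here is what a single node boils down to in plain NumPy: weight the inputs, sum them, add a bias, and apply an activation:

```python
import numpy as np

def relu(z):
    # The activation function: pass positives through, clamp negatives to 0.
    return np.maximum(0.0, z)

def node(inputs, weights, bias):
    # A single node: weight each input, sum, add a bias, apply the
    # activation; the result is what gets sent on to the next nodes.
    return relu(np.dot(weights, inputs) + bias)

x = np.array([0.5, -1.2, 3.0])   # outputs arriving from previous nodes
w = np.array([0.4, 0.1, -0.6])   # learned weights on each connection
print(node(x, w, bias=0.2))      # -> 0.0 (the weighted sum is negative)
```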

- Pros

The more data, the better the model performs. Compared to other models, a CNN's performance doesn't plateau as quickly as more data is added.

Large freedom in how your neural network (NN) is composed.

Can be expressed as a linear function inside a nonlinear function inside a linear function, and so on.

- Cons

Computation time

For this competition, lack of data.

Our Model

First Model

Our first attempt at a CNN was a simple one: a quick reshape of our data into 28x28 matrices, two convolution layers with the ReLU (Rectified Linear Unit) activation function, a flatten layer, and a dense layer with the softmax activation function. With this model we achieved a score of 0.85266 (about 85% accuracy).
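Here is a sketch of that first model in Keras. The exact filter counts and kernel sizes are from memory, so treat them as assumptions; the overall shape (two convolutions, flatten, softmax) is what we used:

```python
from tensorflow.keras import layers, models

# Reshape flat 784-pixel rows into 28x28 images with one gray-scale
# channel, which is the input format Conv2D expects.
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

# Two convolutions -> flatten -> softmax over the 10 categories.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd",  # we only switched to Adam later on
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, batch_size=64)
```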

The next step we took was to look back at our data. After a week of reading articles, two things were always mentioned: more data and normalization of the data. While coming up with new data seemed beyond my expertise, the next thing to do was to normalize my data, both to speed up training on my machine and to give the cost function a more symmetric, better-conditioned shape.
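Normalization itself is just a couple of lines (a sketch, assuming the pixel values are still raw 0-255 bytes):

```python
# Scale pixel values from the 0-255 byte range down to 0-1 so every
# input feature lives on the same scale.
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
```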

Once that was done, I needed to arm myself with some tools that would help me create a CNN: dropout layers, convolutional layers, pooling layers, and batch-normalization layers. Taking inspiration from LeNet-5, I tried modeling my CNN after it, for it was compact and used very few layers, creating cycles of convolution -> pooling -> dropout -> batch normalization and re-positioning the layers within the cycle. The next big step was looking at optimizers, with Adam being the popular choice in many of the articles I read. With all of this I was finally able to meet the previous score of 85%. However, that result was the issue: after so many hours of hard work, I ended up where I started.
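A sketch of the LeNet-5-inspired version follows. The specific layer sizes and dropout rates are assumptions, but the convolution -> pooling -> dropout -> batch-normalization cycle and the switch to the Adam optimizer match what we experimented with:

```python
from tensorflow.keras import layers, models

# Two convolution -> pooling -> dropout -> batch-normalization cycles,
# then a small dense head over the 10 categories.
model = models.Sequential([
    layers.Conv2D(32, (5, 5), activation="relu", padding="same",
                  input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    layers.BatchNormalization(),

    layers.Conv2D(64, (5, 5), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.25),
    layers.BatchNormalization(),

    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",  # the popular choice in the articles we read
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```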

A Mistake

Clearly the complexity of the model was not the determining factor, since our first model outperformed every model we had come up with thus far. We took a step back, scrapped the new model, and went back to the first one. We applied normalization to the training data set, used the Adam optimizer, and moved on from there.

Adding more data was our next choice, so we used an ImageDataGenerator to generate more data. By passing our data set through the generator, it applies transformations to our data and adds the results to our training set: rotation by an angle in degrees, shifts in the x and y directions, shear by an angle in degrees, zoom in the x and y directions, horizontal flips, and vertical flips. Hence we get more data from our limited data set. In addition, we started looking at epochs, specifically at epoch vs. accuracy for the training and validation sets. The final accuracy of our model was 0.91200 (91.2%).
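A sketch of the augmentation setup; the exact ranges and flip settings below are assumptions, but the parameters are the ones listed above:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Each batch drawn from the generator is randomly transformed, so the
# model rarely sees the exact same image twice.
datagen = ImageDataGenerator(
    rotation_range=10,        # rotation angle in degrees
    width_shift_range=0.1,    # shift in the x direction
    height_shift_range=0.1,   # shift in the y direction
    shear_range=10,           # shear angle in degrees
    zoom_range=0.1,           # zoom, applied independently in x and y
    horizontal_flip=True,     # mirror images left-to-right
    vertical_flip=False,      # upside-down clothing is rare, so off here
)

# Train on the augmented stream, tracking accuracy on held-out data
# each epoch so we can plot epoch vs. accuracy afterwards.
history = model.fit(
    datagen.flow(x_train, y_train, batch_size=64),
    validation_data=(x_test, y_test),
    epochs=30,
)
```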

Final Model
Epoch vs Accuracy
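The curves in the figure come straight out of the History object that model.fit returns (in newer TensorFlow/Keras the keys are accuracy and val_accuracy; older versions used acc and val_acc):

```python
import matplotlib.pyplot as plt

# Training vs. validation accuracy per epoch, from the History object.
plt.plot(history.history["accuracy"], label="train")
plt.plot(history.history["val_accuracy"], label="validation")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()
```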

An interesting thing to look at was the images the model was getting wrong.

First 20

These are the first 20 images that my model got wrong. An interesting one is the upper-right image. One explanation for why my model categorized it incorrectly is the design of the article itself, in other words, a limitation of our data. Since our data were in gray scale and not color, information is lost between the image that was taken and the image we were presented, and thus the model was limited in its predictions, where color might have given us the key details that could have made the difference.
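For anyone curious how a grid like this is produced, here is a sketch that pulls out the first 20 test images the model mislabels:

```python
import numpy as np
import matplotlib.pyplot as plt

# Indices of the first 20 test images the model labels incorrectly.
preds = np.argmax(model.predict(x_test), axis=1)
wrong = np.where(preds != y_test)[0][:20]

# Show them in a 4x5 grid with predicted vs. true labels.
fig, axes = plt.subplots(4, 5, figsize=(10, 8))
for ax, i in zip(axes.ravel(), wrong):
    ax.imshow(x_test[i].reshape(28, 28), cmap="gray")
    ax.set_title(f"pred {preds[i]} / true {y_test[i]}")
    ax.axis("off")
plt.show()
```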

In conclusion, my model wasn't first in the competition and it still needs improvement. Experimentation and failure, along with the discovery of new tools, helped to move and change the design of the model. For someone who didn't know what the world of machine learning was, it turned out not to be impossible; anyone can enter the world of machine learning.
