Day 52 of 100DaysofML
CIFAR-10 Photo Classification Part-2. In the last blog I covered the overall gist of the dataset and the preprocessing it requires, along with importing the required libraries and the dataset and scaling the pixel values of the individual images.
In this blog, we shall build the neural network for this problem.
I would recommend copying the code from the previous blog before you start with this implementation.
Let us start by importing the Sequential model class from Keras and creating the model.
from keras.models import Sequential
model=Sequential()
Our next step is to add the layers to our neural network.
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
model=Sequential()
# Block 1: two 32-filter convolutions followed by 2x2 max pooling
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
# Block 2: two 64-filter convolutions followed by 2x2 max pooling
model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
# Block 3: two 128-filter convolutions followed by 2x2 max pooling
model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
This concludes the feature detection part of the model. We first import the layers that we are adding to our neural network. Conv2D is the two-dimensional convolutional layer. Max pooling and flattening layers are standard parts of how CNNs work, and I would suggest reading my previous blogs about them. The activation function is set to 'relu' (rectified linear unit), which is the usual choice for convolutional layers. As you may notice, we have added a Max Pooling layer after every two Conv2D layers.
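At this point you can sanity-check the architecture by printing the model summary: with 'same' padding, each 2x2 pooling layer halves the spatial size, so the feature maps shrink from 32x32 to 16x16, then 8x8, and finally 4x4.
model.summary()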
Maximum pooling, or max pooling, is a pooling operation that calculates the maximum, or largest, value in each patch of each feature map. The result is a down-sampled (pooled) feature map that highlights the strongest presence of a feature in each patch, rather than its average presence as average pooling would.
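As a quick illustration (a toy NumPy example, not CIFAR-10 data), 2x2 max pooling with stride 2 simply keeps the largest value from each 2x2 patch:
import numpy as np
fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 2],
                 [7, 2, 9, 4],
                 [1, 5, 3, 8]])
# Split the 4x4 map into 2x2 patches and keep the maximum of each patch
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 2]
               #  [7 9]]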
This concludes the feature-extraction layers of the network; next we add the classification layers on top.
After that, we compile and optimize the model. The optimizer I shall be using is the commonly used SGD (Stochastic Gradient Descent) algorithm, which we again import from Keras.
from keras.optimizers import SGD
model.add(Flatten())  # collapse the final 4x4x128 feature maps into a single vector
model.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))  # one output probability per CIFAR-10 class
Here, we flatten the feature maps and pass them through a Dense layer, followed by a final Dense layer with a softmax activation. This is the classification part of the model: it interprets the extracted features and predicts which of the ten classes a given photo belongs to.
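To make the softmax output concrete, here is a small illustration (toy numbers, not actual model output) of how it turns raw scores into class probabilities that sum to one:
import numpy as np
logits = np.array([2.0, 1.0, 0.1])          # toy scores for three classes
probs = np.exp(logits) / np.exp(logits).sum()
print(probs)        # approximately [0.659 0.242 0.099]
print(probs.sum())  # 1.0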
opt = SGD(lr=0.001, momentum=0.9)  # newer Keras/TensorFlow versions use learning_rate instead of lr
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
We will use a modest learning rate of 0.001 and a large momentum of 0.9, both of which are good general starting points. The model will optimize the categorical cross-entropy loss function, which suits this multi-class problem. We have set the metric to accuracy so that we can monitor the classification accuracy at each epoch as we go through the entire training process.
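One thing to be aware of: categorical_crossentropy expects one-hot encoded targets. If your labels were not already one-hot encoded as part of the preprocessing from the previous blog, you can convert them like this:
from keras.utils import to_categorical
trainY = to_categorical(trainY)  # e.g. label 3 becomes [0,0,0,1,0,0,0,0,0,0]
testY = to_categorical(testY)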
The next step is to fit the model to the data. It is better to use a separate validation dataset, e.g. by splitting the train dataset into train and validation sets. We will not split the data in this case, and instead use the test dataset as a validation dataset to keep the example simple. The test dataset is then evaluated at the end of each training epoch, which gives us a trace of model evaluation scores on the train and test datasets for every epoch that can be plotted later.
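For reference, if you did want a proper validation split instead, Keras can hold out a fraction of the training data for you (this is an alternative to what we do below, not part of it):
history = model.fit(trainX, trainY, epochs=100, batch_size=64, validation_split=0.1, verbose=0)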
history = model.fit(trainX, trainY, epochs=100, batch_size=64, validation_data=(testX, testY), verbose=0)
_, acc = model.evaluate(testX, testY, verbose=0)  # evaluate the trained model on the test set
print('> %.3f' % (acc * 100.0))
summarize_diagnostics(history)
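summarize_diagnostics is not defined in this snippet; if it was not part of the code you copied over from the previous blog, a minimal sketch along these lines will plot the loss and accuracy curves from the training history (on older Keras versions the keys may be 'acc'/'val_acc' instead of 'accuracy'/'val_accuracy'):
import matplotlib.pyplot as plt
def summarize_diagnostics(history):
    # Cross-entropy loss on the train and test data per epoch
    plt.subplot(211)
    plt.title('Cross Entropy Loss')
    plt.plot(history.history['loss'], color='blue', label='train')
    plt.plot(history.history['val_loss'], color='orange', label='test')
    # Classification accuracy per epoch
    plt.subplot(212)
    plt.title('Classification Accuracy')
    plt.plot(history.history['accuracy'], color='blue', label='train')
    plt.plot(history.history['val_accuracy'], color='orange', label='test')
    plt.tight_layout()
    plt.show()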
We obtained an overall accuracy of around 70 percent, which can be further improved by playing around with the training parameters, the preprocessing, or the model architecture.
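As one concrete example of such a tweak (a sketch, not the exact model trained above), you could rebuild the network with Dropout layers after each pooling step to reduce overfitting:
from keras.layers import Dropout
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))  # randomly drop 20% of activations during training
# ...repeat for the 64- and 128-filter blocks, then add the classifier head as before...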
That’s it for today. Keep Learning.
Cheers.