CNN for MNIST Handwritten dataset

Yashwant Kundathil
4 min read · May 10, 2020

Architecture of ConvNet

A Convolutional Neural Network (CNN), or simply ConvNet, is one of the most popular algorithms for image recognition and classification. A typical ConvNet combines convolutional layers, pooling layers and dense layers. In this article we will discuss how to use a CNN on the MNIST handwritten digit dataset.

MNIST Dataset

MNIST is a basic dataset of 70,000 28x28 grayscale images of handwritten single digits from 0 to 9, split into 60,000 training images and 10,000 test images. Although the problem has effectively been solved, it is still used as a basis for learning and practicing how to develop, evaluate and use a CNN for image classification from scratch.

Problem Description

Our task here is to classify a given image of a handwritten digit into one of 10 classes representing the integer values from 0 to 9, inclusive.

Loading the Dataset

from keras.datasets import mnist
(X_train, y_train),(X_test, y_test) = mnist.load_data()

If you print X_train[0], you get an ndarray of pixel values for the first image. You can visualize it using the matplotlib library.

import matplotlib.pyplot as plt
img1 = X_train[0]
plt.imshow(img1, cmap="gray")

Data Preprocessing

Now that we have the dataset, let's start preprocessing it and transform it into a form that is convenient for our CNN model to work with.

Let's look at y_train in our notebook.

y_train
array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

As you can see, the labels are values ranging from 0 to 9. If we were to feed them to our model as they are, it would treat them as continuous values. This is not what we want. We want our CNN model to treat them as classes, so we will use one-hot encoding here. Keras already has a built-in function to carry out this task.

from keras.utils import to_categorical
y_cat_test = to_categorical(y_test, 10)
y_cat_train = to_categorical(y_train, 10)
## 10 is the number of classes
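
To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding does to the labels. The one_hot helper is a hypothetical name used only for illustration; it is not how to_categorical is implemented, just an equivalent idea under the assumption that the labels are integers from 0 to num_classes - 1.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1.0 per row.

    Hypothetical helper for illustration only; assumes integer labels
    in the range 0 .. num_classes - 1.
    """
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Row i gets a 1.0 in the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = [5, 0, 4]  # the first three labels of y_train shown above
encoded = one_hot(sample_labels, 10)
print(encoded[0])  # the label 5 becomes a row with 1.0 at index 5
```

Each label becomes a row of ten values, so the network can output one probability per class and be compared against this row.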

If you print X_train[0] or any other image, you can see that the values range from 0 to 255. We will divide X_train and X_test by 255, because 8-bit pixel intensities take 256 possible values (0 to 255) and neural networks generally train better on inputs scaled to the range 0 to 1. Since the maximum value in both arrays is 255, dividing by the maximum does exactly that.

X_train = X_train/X_train.max()
X_test = X_test/X_test.max()
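
As a small illustration of the scaling step, the sketch below uses a made-up 2x2 array in place of a real image. Dividing by the constant 255.0 instead of the array maximum is an assumption of this sketch; it gives the same result on MNIST but does not depend on the brightest pixel actually being present in the data.

```python
import numpy as np

# Hypothetical 2x2 "image" with raw 8-bit pixel intensities.
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Dividing by 255.0 converts the data to floating point and
# scales every value into the range [0, 1].
scaled = pixels / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```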

One last change we need to make is to reshape X_train and X_test. If we check the shape of X_test, it looks like this:

X_test.shape
(10000, 28, 28)

We are missing the color channel dimension here. Since the images are grayscale there is a single channel, so we reshape the arrays to add it:

X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)
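
If you prefer not to hard-code the number of images, the same reshape can be written without it. The sketch below uses a small zero-filled array as a stand-in for the real data (an assumption made so it runs without downloading MNIST) and shows two equivalent ways to add the channel axis.

```python
import numpy as np

# Stand-in for a batch of 5 grayscale images (not real MNIST data).
batch = np.zeros((5, 28, 28), dtype="float32")

# -1 lets NumPy infer the number of images from the array itself.
with_channel = batch.reshape(-1, 28, 28, 1)

# np.expand_dims appends a new axis of length 1 at the end.
also_with_channel = np.expand_dims(batch, axis=-1)

print(with_channel.shape, also_with_channel.shape)
```

Both forms work unchanged for X_train and X_test, whatever their length.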

Building CNN Model

Let’s start building our CNN model for training on the MNIST data. In general, a CNN model consists of convolutional layers, pooling layers and dense layers.

A convolutional layer is created with Conv2D(), to which we pass the number of filters, the kernel size, the shape of the input and the activation function. Generally, we use the Rectified Linear Unit (ReLU) as the activation. The number of filters and the kernel size are hyperparameters, so you can experiment with different values and keep the ones that work well. A pooling layer is created with MaxPool2D(). Before we pass the data to the dense layers, we need to convert the 2D feature maps into a 1D vector, which is done by another layer, Flatten(). After this the data goes to the dense layers, which are created with Dense().
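
To show what these layers do to an image, here is a rough NumPy sketch of a single filter followed by ReLU, 2x2 max pooling and flattening. The function names, the random image and the random kernel are all hypothetical stand-ins; Keras uses much faster implementations and learns the kernel values during training.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window by the kernel element-wise and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0)

def max_pool_2x2(x):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    trimmed = x[:h * 2, :w * 2]  # drop a trailing odd row/column
    return trimmed.reshape(h, 2, w, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((28, 28))            # stand-in for one MNIST image
kernel = rng.standard_normal((4, 4))    # stand-in for one learned filter

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool_2x2(feature_map)
flat = pooled.reshape(-1)               # what Flatten() does per image

print(feature_map.shape, pooled.shape, flat.shape)  # (25, 25) (12, 12) (144,)
```

A real convolutional layer applies many such filters at once, which is why its output has one feature map per filter.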

Once you have created the model, you need to compile it. When compiling, you pass the loss function, the optimizer and the metrics you are going to use for evaluation.

from keras.models import Sequential
from keras.layers import Conv2D, Dense, MaxPool2D, Flatten
model = Sequential()
## Convolution Layer
model.add(Conv2D(filters=32, kernel_size=(4,4), input_shape=(28,28,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2)))
## Converting from 2D -> 1D
model.add(Flatten())
## Dense Layer
model.add(Dense(128, activation="relu"))
model.add(Dense(10, activation="softmax"))
## Compiling The Model
model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
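
It can help to work out by hand how large this model is. The sketch below is plain arithmetic, assuming the Keras defaults for the layers above: no padding and stride 1 for Conv2D, and a pooling stride equal to the pool size. The helper name conv_output_size is made up for this example.

```python
def conv_output_size(size, kernel, stride=1):
    """Output width/height of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1

# Spatial size after each layer, starting from a 28x28 input.
conv_side = conv_output_size(28, 4)                    # 4x4 kernel
pool_side = conv_output_size(conv_side, 2, stride=2)   # 2x2 max pooling
flat_units = pool_side * pool_side * 32                # 32 feature maps

# Trainable parameters: weights plus one bias per output unit/filter.
conv_params = (4 * 4 * 1 + 1) * 32
dense1_params = (flat_units + 1) * 128
dense2_params = (128 + 1) * 10
total_params = conv_params + dense1_params + dense2_params

print(conv_side, pool_side, flat_units)  # 25 12 4608
print(total_params)                      # 591786
```

Almost all of the parameters sit in the first dense layer. You can compare these numbers with the output of model.summary().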

Fitting the dataset into the model

This is a simple step: take X_train and y_cat_train and pass them to model.fit(). For a quick run, let's set epochs=2.

model.fit(X_train, y_cat_train, epochs=2)
Epoch 1/2
- 38s - loss: 0.1329 - accuracy: 0.9604
Epoch 2/2
- 38s - loss: 0.0484 - accuracy: 0.9858
<keras.callbacks.callbacks.History at 0x187119bb588>

We can see that we are getting a very good training accuracy (about 0.986) in just 2 epochs.

Evaluating the Model

Keras has a built-in method called evaluate, which returns the loss and the accuracy on the test set.

model.evaluate(X_test, y_cat_test)
10000/10000 [==============================] - 2s 165us/step
[0.03947468514149077, 0.9872000217437744]
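
The two numbers in that list are the loss and the accuracy. As a rough sketch of how they relate to the predictions, the NumPy code below computes categorical cross-entropy and accuracy for two made-up predictions over three classes. The function names and the toy data are hypothetical; Keras computes the same quantities over all test images.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -log(probability assigned to the correct class)."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    """Fraction of rows where the most probable class is the true one."""
    return float(np.mean(y_true.argmax(axis=1) == y_prob.argmax(axis=1)))

# Toy one-hot labels and predicted probabilities (not MNIST output).
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype=float)
y_prob = np.array([[0.1, 0.2, 0.7], [0.3, 0.6, 0.1]])

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(round(loss, 4), acc)  # 0.7803 0.5
```

A low loss means the model assigns high probability to the correct digit, not just that it picks the right one.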

But let's say we need a proper report that includes precision, recall and F1 score. For this we will use the scikit-learn package, which provides a function called classification_report.

First, we need the predicted class for every test image. model.predict() returns ten class probabilities per image, so we take the index of the largest one with np.argmax(). (Older Keras versions offered model.predict_classes() as a shortcut for this.)

Now, we pass the actual values (y_test) and the predicted values (y_pred) to classification_report() and store the result in the report variable.

Finally, we print the report.

from sklearn.metrics import classification_report
import numpy as np
y_pred = np.argmax(model.predict(X_test), axis=1)
report = classification_report(y_test, y_pred)
print(report)
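
For each digit, the report lists precision, recall and F1 score. The sketch below shows how these are defined for a single class, using a hypothetical helper and a handful of made-up labels rather than the real predictions.

```python
import numpy as np

def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1 for one class (illustrative helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == label) & (y_true == label))  # correctly predicted
    fp = np.sum((y_pred == label) & (y_true != label))  # predicted but wrong
    fn = np.sum((y_pred != label) & (y_true == label))  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return float(precision), float(recall), float(f1)

# Toy labels (not MNIST output): one 3 is mistaken for an 8 and vice versa.
toy_true = [3, 3, 3, 8, 8, 1]
toy_pred = [3, 3, 8, 8, 3, 1]

precision, recall, f1 = per_class_scores(toy_true, toy_pred, label=3)
print(precision, recall, f1)
```

Precision asks how many of the images predicted as a digit really are that digit, while recall asks how many images of that digit were found.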

This model reaches an accuracy of about 0.987 on the test set, which is really good.

Yashwant Kundathil

Tech Enthusiast | Student @ SRM IST, KTR Campus