A Convolutional Neural Network-based Digit Recognition Web App using Python and Javascript — Part I

FRANTO SHAJEL
Published in Data Science v2.0
6 min read · Nov 30, 2019


This is the first post in my project series. Going by the saying “With practice comes mastery”, I have decided to practice the machine learning and deep learning concepts that I learn or come across and turn them into a series of projects and posts, so that fellow machine learning and deep learning enthusiasts can either learn from them or correct me (mutual learning).

This article is the first part of my two-part article on the Digit Recognition Project. In this part, we will build a Convolutional Neural Network from scratch, using Keras and TensorFlow, to classify hand-written digits. In the next part, we will develop our web app using HTML and JavaScript to deploy our model.

Digit Recognition Web App Demo

The source code for the entire project is available here and the demo can be accessed here.

Importing the Dataset:

First, we import the TensorFlow library and load the MNIST digit dataset. The MNIST dataset contains 60,000 training images and 10,000 testing images. The load_data() function returns it already split into four arrays: (x_train, y_train), (x_test, y_test). The train arrays, containing 60,000 images, will be used for training our CNN model, and the test arrays, containing 10,000 images, will be used for testing it. The x arrays hold the images while the y arrays hold the labels associated with them.

import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

Next, we check the contents of the dataset. For this, we import matplotlib.pyplot, a collection of command-style functions that make matplotlib work like MATLAB. Each function in pyplot makes some change to a figure (for example, creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.). The imshow() function in pyplot is used to display an image. Here, since the index 44444 has been selected, it displays the image at that index, which is a ‘6’. The index can range from 0 to 59,999, because the 60,000 training images are zero-indexed.

import matplotlib.pyplot as plt
image_index = 44444
print(y_train[image_index])
plt.imshow(x_train[image_index], cmap='Greys')
#This shows an image of '6'

Note: We are not importing all the libraries at first. We are importing them at respective places to have a better understanding of what is happening in our code.

Preprocessing:

Next, y_train and y_test should be converted into categorical (one-hot) format using the to_categorical function from the Keras utilities. This converts each label into a vector that our model can handle (for example, the label 2 is converted into [0,0,1,0,0,0,0,0,0,0]).

from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, num_classes = 10, dtype = 'float32')
y_test = to_categorical(y_test, num_classes = 10, dtype = 'float32')
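To make the conversion concrete, here is a minimal NumPy sketch of what one-hot encoding does. The helper name one_hot is hypothetical and not part of Keras; it only mirrors the idea behind to_categorical.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Hypothetical helper: turn integer labels into one-hot rows."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([2, 0, 9])
print(encoded[0])  # the label 2 has its 1.0 at position 2
```

Each row contains a single 1 at the position of its label, which is the format the categorical cross-entropy loss used later expects.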

Then, some preprocessing is carried out on the images (x_train and x_test). Their shapes are changed to include a channel dimension. Here the number of channels is 1, since the images are grayscale rather than RGB. Then, their array values are converted to float so that we get decimal values after division, and the pixel intensities are normalized to the range 0 to 1 by dividing them by the maximum pixel value, 255.

#Reshaping
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
#Converting to float
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
#Normalizing
x_train /= 255
x_test /= 255
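If you want to check these three steps without downloading MNIST, the sketch below runs them on a small synthetic batch. The array fake_images and the name batch are stand-ins I made up for illustration; they are not part of the project code.

```python
import numpy as np

# Stand-in for a small MNIST batch: 5 grayscale images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Same three steps as above: add a channel axis, cast to float, scale to [0, 1]
batch = fake_images.reshape(fake_images.shape[0], 28, 28, 1)
batch = batch.astype('float32')
batch /= 255

print(batch.shape, batch.dtype, batch.min(), batch.max())
```

The cast has to come before the division: NumPy refuses an in-place true division on an integer array, because the float result cannot be stored back into uint8.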

Building our Model:

Then the Convolutional Neural Network is built from scratch using TensorFlow (as below). The Sequential model is a linear stack of different layers like Conv2D, MaxPooling2D, BatchNormalization, Dropout, Dense, etc. Conv2D applies a convolution: it slides a set of small filters across the image and produces feature maps, which are usually smaller than the input. The stride is the number of pixels by which the filter moves horizontally or vertically at each step.
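As a rough illustration of how a filter slides across an image and how the stride changes the output size, here is a naive NumPy sketch. The function conv2d_single is a hypothetical helper written for this post. It handles one channel, one filter, and no padding, and it is far slower than what Keras actually runs.

```python
import numpy as np

def conv2d_single(image, kernel, stride=1):
    """Naive 'valid' convolution of one channel with one square filter.

    Like deep learning libraries, this is technically a cross-correlation:
    the kernel is not flipped.
    """
    h, w = image.shape
    k = kernel.shape[0]
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Patch of the image currently covered by the filter
            patch = image[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(36, dtype=np.float32).reshape(6, 6)
kernel = np.ones((3, 3), dtype=np.float32)

out_stride1 = conv2d_single(image, kernel, stride=1)
out_stride2 = conv2d_single(image, kernel, stride=2)
print(out_stride1.shape)  # (4, 4)
print(out_stride2.shape)  # (2, 2)
```

A 6x6 input with a 3x3 filter shrinks to 4x4 at stride 1 and to 2x2 at stride 2, which is the "summarizing into a smaller matrix" effect described above.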

MaxPooling summarizes the presence of features in an input image by downsampling it, i.e., it keeps only the strongest activation in each small sub-region, which reduces the spatial size of the feature maps while retaining the most prominent features. BatchNormalization normalizes the inputs of each layer so that they have a mean of zero and a standard deviation of one across the batch, which in turn allows much higher learning rates and so increases the speed at which networks train.
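Both ideas fit in a few lines of NumPy. The sketch below is a simplified version under my own assumptions: max_pool_2x2 and batch_norm are hypothetical helpers, the pooling window is fixed at 2x2, and the learned scale and shift of a real BatchNormalization layer are left out.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h, w = feature_map.shape
    # Drop a trailing row/column if the size is odd
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def batch_norm(x, eps=1e-3):
    """Normalize each feature (column) across the batch dimension."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 7, 2],
                 [3, 2, 1, 6]], dtype=np.float32)
pooled = max_pool_2x2(fmap)
print(pooled)  # 2x2 result holding the maxima 4, 5, 3, 7

activations = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
normed = batch_norm(activations)
print(normed.mean(axis=0), normed.std(axis=0))
```

After normalization both columns have a mean of about 0 and a standard deviation just under 1, even though their original scales differed by a factor of ten.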

Dropout is a regularization approach in neural networks that helps to reduce interdependent learning amongst the neurons: during training, a randomly chosen set of units (neurons) is ignored at each step. Rectified Linear Unit (ReLU) is a non-linear activation function that converts the input signal of a node into an output signal (either 0 or a positive value), which is then used as input to the next layer in the stack. This non-linearity lets the model learn more complex patterns and adapt to a variety of data.
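Here is a small NumPy sketch of both operations. The functions relu and dropout are hypothetical helpers for illustration. The dropout variant shown is the common "inverted" form, where the surviving units are rescaled during training so that the expected activation stays the same.

```python
import numpy as np

def relu(x):
    """Negative inputs become 0, positive inputs pass through unchanged."""
    return np.maximum(x, 0)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of units, rescale the rest."""
    if not training:
        return x  # dropout is a no-op at prediction time
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)

signal = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
activated = relu(signal)
print(activated)

dropped = dropout(np.ones(10000), rate=0.4, rng=rng)
print((dropped == 0).mean())  # roughly 0.4 of the units are zeroed
```

With a rate of 0.4, as used in the model below, about 40% of the units are silenced at each training step, while at prediction time every unit is used.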

Flatten reshapes the tensor into a one-dimensional vector whose length equals the number of elements in the tensor, not including the batch dimension, i.e., it flattens the output of the convolutional layers into a single long feature vector.

The Dense layer feeds all outputs from the previous layer to all its neurons, each neuron providing one output to the next layer. Softmax assigns a decimal probability to each class in a multi-class problem, and these probabilities add up to 1. The class with the highest probability is the predicted output.
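The softmax step can be sketched in NumPy as follows. The logits values are made up for illustration and stand in for the ten raw outputs of the last Dense layer.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow without changing the result
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

# Made-up raw scores for the digits 0 to 9
logits = np.array([1.0, 0.5, 4.0, 0.1, 0.0, 0.2, 0.3, 2.0, 0.4, 0.6])
probs = softmax(logits)
predicted_digit = int(np.argmax(probs))

print(probs.sum(), predicted_digit)
```

The digit with the largest raw score (index 2 here) also gets the largest probability, so taking the argmax of the softmax output gives the prediction.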

from tensorflow import keras

input_shape = (28, 28, 1)

model = keras.Sequential([
    # First block: 32 filters
    keras.layers.Conv2D(32, kernel_size = 3, activation='relu', input_shape = input_shape),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(32, kernel_size = 3, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(32, kernel_size = 5, strides=2, padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.4),
    # Second block: 64 filters
    keras.layers.Conv2D(64, kernel_size = 3, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, kernel_size = 3, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, kernel_size = 5, strides=2, padding='same', activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(0.4),
    # Classifier: flatten the feature maps and predict one of the 10 digits
    keras.layers.Flatten(),
    keras.layers.Dropout(0.4),
    keras.layers.Dense(10, activation='softmax')
])

# summary() prints the layer table itself, so no print() is needed
model.summary()
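To see how the spatial size shrinks through this stack, the sketch below traces the height/width after every convolution and pooling layer using the usual output-size formulas. The helper out_size and the layers table are my own illustration, and I am assuming MaxPooling2D uses its default 2x2 window. BatchNormalization and Dropout are left out because they do not change the shape.

```python
import math

def out_size(size, kernel, stride=1, padding='valid'):
    """Output height/width of a convolution or pooling layer."""
    if padding == 'same':
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

# (description, kernel size, stride, padding) for each shape-changing layer
layers = [
    ('Conv2D 32, k=3', 3, 1, 'valid'),
    ('MaxPooling2D 2x2', 2, 2, 'valid'),
    ('Conv2D 32, k=3', 3, 1, 'valid'),
    ('Conv2D 32, k=5, s=2, same', 5, 2, 'same'),
    ('Conv2D 64, k=3', 3, 1, 'valid'),
    ('Conv2D 64, k=3', 3, 1, 'valid'),
    ('Conv2D 64, k=5, s=2, same', 5, 2, 'same'),
]

size = 28
sizes = []
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f'{name:28s} -> {size}x{size}')

flattened = size * size * 64  # 64 filters in the last convolution
print('Flatten ->', flattened)
```

Under these assumptions the 28x28 input is reduced step by step to 1x1 with 64 channels, so Flatten hands a 64-element vector to the final Dense layer. You can compare this against the output of model.summary().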

Training our Model:

Then we configure our model with the compile function and train it with the fit function. The loss function is a method of evaluating how well our algorithm models our dataset. It gives a high loss value when the prediction deviates too much from the true label. With the help of the optimizer, the model then learns to reduce the error in its predictions. Here Adam is used as our optimizer, as it uses adaptive learning rate methods to find individual learning rates for each parameter. A metric is a function that is used to judge the performance of the model, and here accuracy is chosen as the metric. We train the model for 30 epochs, i.e., the model makes 30 full passes over the entire 60,000 images and their labels.

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x=x_train,y=y_train, epochs=30)
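To give a feel for how the loss reacts to good and bad predictions, here is a minimal NumPy sketch of categorical cross-entropy. The function is a simplified stand-in written for this post, not the Keras implementation, and the three-class examples are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid taking log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[0.0, 0.0, 1.0]])            # the true class is index 2
good_pred = np.array([[0.05, 0.05, 0.90]])      # confident and correct
bad_pred = np.array([[0.70, 0.20, 0.10]])       # confident and wrong

loss_good = categorical_crossentropy(y_true, good_pred)
loss_bad = categorical_crossentropy(y_true, bad_pred)
print(loss_good, loss_bad)
```

The correct prediction gives a loss of about 0.105, while the wrong one gives about 2.303. This gap is the signal the optimizer uses to adjust the weights.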

Evaluating our Model:

Then we test our model on the 10,000 test images using the evaluate function, which returns the loss followed by the accuracy. Here we get an accuracy of 99.5%, which is more than satisfactory.

model.evaluate(x_test, y_test)
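The accuracy metric itself is simple to compute by hand. The sketch below uses a hypothetical helper, accuracy, and four made-up predictions over three classes to show the idea: take the most probable class for each sample and compare it with the true label.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of samples whose most probable class matches the true label."""
    true_labels = np.argmax(y_true_one_hot, axis=1)
    predicted = np.argmax(y_prob, axis=1)
    return float(np.mean(true_labels == predicted))

# Four made-up samples with true classes 0, 1, 2, 1
y_true = np.eye(3)[[0, 1, 2, 1]]
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],   # predicted class 1, true class 2
                   [0.1, 0.6, 0.3]])

acc = accuracy(y_true, y_prob)
print(acc)  # 0.75
```

Three of the four predictions match, giving 0.75. On the real test set the same comparison runs over all 10,000 images.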

Saving our Model:

Everything is done and our model has been developed, trained and tested successfully. To use it in an external application (a web app in this case), we need to save our model in the format that application requires. Since we are developing our web app using JavaScript, we need a JSON file, and hence we save our model in that format. To do so, we need to import the tensorflowjs library (install it first with pip install tensorflowjs if it is not already available). After importing, just save the model as below. Since I am using Google Colab, I have given the destination as ‘/content/models’, which saves the JSON and weight files in the models folder created in my Colab notebook. You can use whichever destination you wish if you are working on your local machine.

import tensorflowjs as tfjs
tfjs.converters.save_keras_model(model, '/content/models')

That’s it…!!! We have successfully built our CNN model and saved it as a JSON file. In the next part, we will be learning how to build a web application for this model using HTML, CSS, and JavaScript.

Source Code: Github Repo

Demo: Heroku App

Link to the next part.

Expecting your feedback…!!!

Thank you…!!!
