Handwritten Digit Prediction using Convolutional Neural Networks in TensorFlow with Keras and Live Example using TensorFlow.js

Ashok Tankala
Coinmonks
7 min read · May 26, 2018

Whenever we start learning a new programming language, we begin with a Hello World program. Likewise, many AI/ML developers say, “Just like programming has Hello World, machine learning has MNIST.”

Like everyone, I wanted to start from there. In fact, I wanted to write my first ML article on MNIST, but that didn’t sound exciting because the internet already has loads of MNIST articles. I wanted mine to be different from the others, so I thought: along with the code, why not share a live example as well?

Let’s get started. I hope you have TensorFlow and Keras installed on your system; if not, please read my previous article, which has instructions on how to install them.

First, let’s import all the necessary libraries.

Next, let’s load the MNIST data provided by Keras.

The datasets (training and test) are 3D arrays. The training dataset has shape (60000, 28, 28) and the test dataset has shape (10000, 28, 28).

The input that a CNN expects is a 4D array (batch, height, width, channels). The channels value signifies whether the image is grayscale or colored. In our case we are using grayscale images, so we use 1 for channels; for color (RGB) images we would use 3. The code below reshapes our inputs.
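As a minimal sketch of that reshape, here is the idea with NumPy. The small random arrays are stand-ins I made up so the snippet runs without downloading anything; the real arrays would come from `keras.datasets.mnist.load_data()` and have 60000 and 10000 images.

```python
import numpy as np

# Stand-in arrays with the same layout as the MNIST arrays (N, 28, 28).
# Sizes 600 and 100 are arbitrary; the real data has 60000 and 10000 images.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(600, 28, 28), dtype=np.uint8)
X_test = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)

# Add a trailing channels axis of size 1 (grayscale):
# (N, 28, 28) -> (N, 28, 28, 1)
X_train_4d = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test_4d = X_test.reshape(X_test.shape[0], 28, 28, 1)

print(X_train_4d.shape)  # (600, 28, 28, 1)
print(X_test_4d.shape)   # (100, 28, 28, 1)
```

The reshape only adds an axis; the pixel values themselves are untouched.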

It’s always good to normalize the data. Each pixel in our datasets has a value between 0 and 255, so we scale it to the range 0–1 using the code below.
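A minimal sketch of the scaling step, assuming the pixels arrive as unsigned 8-bit integers. The helper name `scale_pixels` is my own, not something Keras provides.

```python
import numpy as np

def scale_pixels(images):
    """Convert 0-255 integer pixels to float32 values in the range 0-1."""
    return images.astype("float32") / 255.0

# Three sample pixel values: black, mid-gray, white.
sample = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = scale_pixels(sample)
print(scaled.dtype, scaled.min(), scaled.max())
```

Casting to float32 before dividing matters: the result has to hold fractional values, which an integer array cannot.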

Our output ranges from 0 to 9, so it’s a multi-class classification problem. The class labels have no numeric order as far as the model is concerned (an 8 is not “more” than a 3), so it’s better to use one-hot encoding. One-hot encoding transforms each integer label into a binary vector that contains a single ‘1’, with all other elements ‘0’.

For example, if the expected output is 8, its one-hot encoding is [0,0,0,0,0,0,0,0,1,0].
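To make the encoding concrete, here is a small NumPy sketch. The helper `one_hot` is hypothetical and only for illustration; Keras ships `keras.utils.to_categorical`, which does the same job.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return one row per label with a single 1 at the label's index."""
    labels = np.asarray(labels)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

print(one_hot([8])[0])  # [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
```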

Now let’s build the model.
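Here is a minimal sketch of a model with the layers walked through in this section. A few details are my assumptions, because the description only says “an activation function”: ReLU activations, 2×2 pooling windows, and the default padding. I also use `Conv2D`, the current Keras name for `Convolution2D`. The function name `build_model` is just for illustration.

```python
from tensorflow import keras

def build_model():
    """CNN sketch: two conv/pool stages, dropout, then two dense layers."""
    return keras.Sequential([
        keras.Input(shape=(28, 28, 1)),                    # height, width, channels
        keras.layers.Conv2D(32, (5, 5), activation="relu"),   # assumed ReLU
        keras.layers.MaxPooling2D(pool_size=(2, 2)),           # assumed 2x2 window
        keras.layers.Conv2D(32, (3, 3), activation="relu"),   # assumed ReLU
        keras.layers.MaxPooling2D(pool_size=(2, 2)),           # assumed 2x2 window
        keras.layers.Dropout(0.2),                             # drop 20% while training
        keras.layers.Flatten(),
        keras.layers.Dense(128, activation="relu"),            # assumed ReLU
        keras.layers.Dense(10, activation="softmax"),          # one neuron per digit
    ])

model = build_model()
model.summary()
```

Under these assumptions the feature maps shrink from 28×28 to 24×24, 12×12, 10×10 and finally 5×5, so Flatten hands 5 × 5 × 32 = 800 values to the dense layer.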

Let’s understand the above code step by step.

  1. The first hidden layer is a convolutional layer called Convolution2D. It has 32 filters/output channels, each of size 5×5, and an activation function. This is the input layer, expecting images with the structure outlined above (height, width, channels).
  2. The second layer is a MaxPooling layer. It down-samples its input so the model can make assumptions about the features, which helps reduce overfitting. It also reduces the number of parameters the later layers have to learn, which shortens training time.
  3. Next comes another convolutional hidden layer with 32 filters/output channels of size 3×3 and an activation function.
  4. Then one more MaxPooling layer.
  5. The next layer is a regularization layer called Dropout. It is configured to randomly exclude 20% of the neurons in the layer during training in order to reduce overfitting.
  6. The next layer, Flatten, converts the 2D matrix data to a vector so that the output can be processed by standard fully connected layers.
  7. The next layer is a fully connected layer with 128 neurons.
  8. The last layer is the output layer, with 10 neurons (one per output class) and a softmax activation function. Each neuron gives the probability of its class. We use softmax because this is multi-class classification; for binary classification we would use a sigmoid activation function.

Let’s compile the model. I used categorical_crossentropy as the loss function because it’s a multi-class classification problem. I used Adam as the optimizer to update the weights during training, and accuracy as the metric to track how well the network performs.

It’s time to train our model. The model is fit over 10 epochs and updates its weights after every batch of 200 images. The test data is used as the validation dataset, allowing you to see the skill of the model as it trains.
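The compile and fit settings described above can be sketched like this. The helper `compile_and_fit`, the tiny dense stand-in model, and the random data are all hypothetical, chosen so the snippet finishes in a second or two; with the real CNN and MNIST arrays you would keep the defaults of 10 epochs and a batch size of 200.

```python
import numpy as np
from tensorflow import keras

def compile_and_fit(model, x_train, y_train, x_val, y_val,
                    epochs=10, batch_size=200):
    """Compile with the settings from the text, then train."""
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])
    # Weights are updated once per batch; validation runs after each epoch.
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     epochs=epochs, batch_size=batch_size, verbose=0)

# Random stand-in data and a deliberately tiny model, for illustration only.
rng = np.random.default_rng(0)
x = rng.random((400, 28, 28, 1), dtype=np.float32)
y = keras.utils.to_categorical(rng.integers(0, 10, size=400), num_classes=10)
tiny_model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])

history = compile_and_fit(tiny_model, x[:300], y[:300], x[300:], y[300:], epochs=1)
print(sorted(history.history.keys()))
```

The returned history object holds the per-epoch loss and metric values, which is handy for plotting training against validation curves.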

I want to test my trained model with my own images, so I save the model to my local hard disk.

The test dataset is used to evaluate the model; after evaluation, the test loss and test accuracy metrics are printed.
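Saving and evaluating can be sketched as follows. Again the model and data are small random stand-ins of my own, and the file name `mnist_cnn.h5` is hypothetical; I picked the HDF5 format on the assumption that the model will later be converted from a Keras `.h5` file.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Tiny stand-in model and random data so the sketch runs quickly;
# in the article this would be the trained CNN and the MNIST test set.
rng = np.random.default_rng(0)
x_test = rng.random((50, 28, 28, 1), dtype=np.float32)
y_test = keras.utils.to_categorical(rng.integers(0, 10, size=50), num_classes=10)

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])
model.fit(x_test, y_test, epochs=1, batch_size=25, verbose=0)

# Save architecture and weights to a single HDF5 file (hypothetical name).
model_path = os.path.join(tempfile.mkdtemp(), "mnist_cnn.h5")
model.save(model_path)

# evaluate() returns the loss followed by the compiled metrics.
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", loss)
print("Test accuracy:", accuracy)

# The saved file can be loaded back later, for example in another script.
restored = keras.models.load_model(model_path)
```

With random labels the accuracy printed here is meaningless; the point is only the order of the calls.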

I got around 99.19% accuracy. You will find this example code under the name mnistCNN.py at my GitHub repository.

After completing this I wasn’t satisfied, because the model had only run on the data provided by Keras. I wanted to verify my trained model on my own data, so I created a couple of images myself, stored them in my data folder, and checked them with my model. The results looked decent. Here is the code for this:
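A minimal sketch of the image preparation, assuming Pillow and NumPy. The function name `image_to_model_input` and the generated test image are hypothetical; in practice you would open your own file with `Image.open(...)`. The prediction line is left as a comment because it needs the trained model.

```python
import numpy as np
from PIL import Image, ImageDraw

def image_to_model_input(img):
    """Turn a PIL image into a (1, 28, 28, 1) float32 array scaled to 0-1."""
    gray = img.convert("L").resize((28, 28))          # grayscale, 28x28
    pixels = np.asarray(gray, dtype="float32") / 255.0
    return pixels.reshape(1, 28, 28, 1)               # batch, height, width, channels

# Stand-in for a hand-drawn digit: a white vertical stroke on black,
# roughly a "1". Replace with Image.open("<your file>") for real images.
canvas = Image.new("RGB", (140, 140), "black")
ImageDraw.Draw(canvas).line([(70, 20), (70, 120)], fill="white", width=10)

x = image_to_model_input(canvas)
print(x.shape, x.dtype)

# With a trained model loaded as `model`:
# digit = int(np.argmax(model.predict(x), axis=1)[0])
```

MNIST digits are light strokes on a dark background, so an image drawn in dark ink on white paper would likely need to be inverted first.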

You will find the above code, images, and model file at my GitHub repository. To run the above code you need the Pillow package. Run the command below to install it.

But I still wasn’t satisfied, so I thought, let’s do something more. We all know Google introduced TensorFlow.js, and I read that it can use an existing model. So I thought, why not build a small page for this example? From here the journey became more exciting.

First, we need a canvas where the user can draw a number. For this, I wrote an HTML page with the help of this article.

Now we want to use our model in the browser, so we need to convert it into a format that TensorFlow.js can consume. For this task, this article helped me. To convert a Keras model to a TensorFlow.js-consumable model we need tensorflowjs_converter, which comes with the tensorflowjs package, so we need to install that package.

I used the command below to convert the format.

Now a model file and a couple of supporting files will be created in the models folder, with these names: model.json, group1-shard1of1, group2-shard1of1, group3-shard1of1, group4-shard1of1. These files are what let us use our trained DL (Deep Learning) model in the browser.

Now I am going to reveal the secret ingredient of this story.

I am going to explain 3 important things here; the rest is fairly straightforward. It all starts with including the TensorFlow.js script. Add the line below to your HTML file.

Next is our init function. Two things are important in it.

1. I used an async function because I want to make sure the model is loaded before the example is used. That’s why await is used when loading the model.

2. Loading the model. Use the code below for this.

Next, the most important one: our Predict function.

Let’s understand the above code step by step.

  1. First, we extract the grayscale image from the canvas.
  2. Then we convert that image to a tensor (array).
  3. We want a 28×28 array (image), so we resize the array.
  4. We want the data in float32 format, so we cast it to float32.
  5. We reshape it to [1, 28, 28, 1] because the model expects (batch, height, width, channels).
  6. We normalize the data by dividing it by 255.
  7. Then we predict the number.
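The page itself runs these steps in JavaScript, but the same pipeline can be mirrored in NumPy to check the shapes and value ranges offline. This is only an analogue under my own assumptions: the canvas pixels arrive as an (height, width, 4) RGBA array, the first channel stands in for grayscale, and the resize is a simple nearest-neighbour pick. The function name `preprocess_canvas` is hypothetical.

```python
import numpy as np

def preprocess_canvas(rgba, size=28):
    """Mirror the browser steps: gray -> resize -> float32 -> reshape -> scale."""
    gray = rgba[..., 0]                        # step 1: one channel as grayscale
    h, w = gray.shape
    rows = (np.arange(size) * h) // size       # step 3: nearest-neighbour rows
    cols = (np.arange(size) * w) // size       #         and columns
    small = gray[np.ix_(rows, cols)]           # 28x28 image
    batch = small.astype("float32")            # step 4: cast to float32
    batch = batch.reshape(1, size, size, 1)    # step 5: batch, h, w, channels
    return batch / 255.0                       # step 6: scale to 0-1

# Stand-in canvas: 280x280 RGBA, white in the top-left quadrant only.
canvas = np.zeros((280, 280, 4), dtype=np.uint8)
canvas[:140, :140, :] = 255

x = preprocess_canvas(canvas)
print(x.shape, x.dtype)  # (1, 28, 28, 1) float32

# Step 7, with a trained Keras model loaded as `model`:
# digit = int(np.argmax(model.predict(x), axis=1)[0])
```

Running the same input through both versions is a quick way to spot a mismatch between the browser preprocessing and what the model saw during training.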

You will find this code at another GitHub repository of mine.

You can see the live example here. It’s not perfect but performs decently.

Peace. Happy Coding.
See my original article here.
