A Convolutional Neural Network-based Digit Recognition Web App using Python and JavaScript — Part II

FRANTO SHAJEL
Published in Data Science v2.0
8 min read · Dec 4, 2019

In the first part, we saw how to build the Convolutional Neural Network model to classify hand-written digits using the MNIST digit dataset. In this second and final part, we will be building a web application to deploy our model and test it in the real world.

To test our model (trained and saved as JSON in the previous part) in a real-world scenario, we need an interface, and a web application serves that purpose well. Before building the web app, let us outline its requirements. First, we need a drawing space where the user can draw a digit. Then we need a display area to show the prediction, so we can tell whether the model has classified the digit correctly or not.

The source code for the entire project is available here and the demo can be accessed here.

Canvas:

First, to develop the drawing space, we use an HTML element called Canvas. The canvas lets the user draw or paint in the provided area and supports various paint attributes such as brushes, colors, etc. But since we are just developing a test interface, we keep it simple. The height and width of the canvas can be set in the canvas tag itself, and since our JavaScript will look the element up by id, we also give it one:

<!-- index.html -->
<canvas id="sketchpad" height="200" width="200"></canvas>
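For reference, a minimal page skeleton might look like the following. The element ids (sketchpad, clear_button, predict_button, result, confidence) match the JavaScript used throughout this article, but the overall layout here is only a sketch, not the project's exact markup:

<!-- index.html — a minimal page sketch; only the ids are taken from the scripts below -->
<!DOCTYPE html>
<html>
<body>
  <canvas id="sketchpad" height="200" width="200"></canvas>
  <button id="clear_button">Clear</button>
  <button id="predict_button">Predict</button>
  <div id="result"></div>
  <div id="confidence"></div>
  <!-- TensorFlow.js and main.js are included here; see "Loading the Model" below -->
</body>
</html>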

Initializing the Canvas:

The canvas element declared in HTML must be initialized in JavaScript before it can be used as a drawing board or sketchpad. The getContext() method returns an object that provides methods and properties for drawing on the canvas. Since our model was trained on the MNIST digit dataset, which has white digits on a black background, we style the user interface the same way for better results. Hence we fill the sketchpad background with black using the fillStyle property and the fillRect() method. To make the sketchpad interactive, we need to listen for mouse and touch events at all times, so the corresponding handler functions are registered with addEventListener().

//main.js
var canvas, ctx;
var mouseX, mouseY, mouseDown = 0;
var touchX, touchY;
var lastX = 0, lastY = 0; // last recorded pointer position, used by draw()

function init() {
    canvas = document.getElementById('sketchpad');
    ctx = canvas.getContext('2d');
    // MNIST digits are white on black, so fill the background with black
    ctx.fillStyle = "black";
    ctx.fillRect(0, 0, canvas.width, canvas.height);
    if (ctx) {
        canvas.addEventListener('mousedown', sketchpad_mouseDown, false);
        canvas.addEventListener('mousemove', sketchpad_mouseMove, false);
        window.addEventListener('mouseup', sketchpad_mouseUp, false);
        canvas.addEventListener('touchstart', sketchpad_touchStart, false);
        canvas.addEventListener('touchmove', sketchpad_touchMove, false);
    }
}
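Note that init() is only defined here; it still has to run once the page has loaded. One way to wire this up (an assumption, since the original markup is not shown) is to register it as a load handler at the end of main.js:

// run init() once the page (and the canvas element) is ready
window.addEventListener('load', init, false);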

Draw Function:

To turn the canvas into a working sketchpad (i.e., to enable drawing in the canvas area), we wire the mouse buttons to drawing logic in JavaScript. To make it mobile-friendly, we also handle touch events. Since the drawing action is the same for both mouse and touch, we factor it out into a separate function:

function draw(ctx, x, y, size, isDown) {
    if (isDown) {
        ctx.beginPath();
        ctx.strokeStyle = "white";
        ctx.lineWidth = size;
        ctx.lineJoin = ctx.lineCap = 'round';
        ctx.moveTo(lastX, lastY);
        ctx.lineTo(x, y);
        ctx.closePath();
        ctx.stroke();
    }
    lastX = x;
    lastY = y;
}

Here, when the isDown flag is false, the sketchpad simply records the current pointer position in its lastX and lastY variables. When the isDown flag is true (i.e., when the mouse or finger is dragged), the sketchpad draws a line from the previous position to the current position of the pointer.

The beginPath() method is called first to start a new path. The moveTo() method sets the starting point, and lineTo() then adds a line from that starting point to the current position of the pointer. Once the line is added, the closePath() method closes the current path. Finally, the stroke() method actually paints the line onto the canvas, and this is the point where the sketchpad uses these four attributes:

  1. lineWidth — sets the width of the line.
  2. lineCap — sets the shape of the line's ends.
  3. lineJoin — sets how two line segments are joined.
  4. strokeStyle — sets the color of the line.

Mouse Event Handlers:

First, let us focus on the mouse functions. The sketchpad_mouseDown() function is triggered when the left mouse button is pressed. It sets the mouseDown flag to true (1) and calls draw() with a false flag, which tells the draw function to record the point but not to draw.

function sketchpad_mouseDown(e) {
    mouseDown = 1;
    getMousePos(e); // record the press position even if the mouse has not moved yet
    draw(ctx, mouseX, mouseY, 12, false);
}

The sketchpad_mouseUp() function is triggered when the left mouse button is released; it simply sets the mouseDown flag back to false (0).

function sketchpad_mouseUp() {
    mouseDown = 0;
}

The sketchpad_mouseMove() function is triggered whenever the mouse moves over the canvas, whether or not the left button is held down. In both cases, it gets the position of the pointer using the getMousePos() function. Only if mouseDown is true (i.e., the left mouse button is held down) does it call draw() with a true flag, which tells the draw function to draw a line from the previous position to the current position of the pointer.

function sketchpad_mouseMove(e) {
    getMousePos(e);
    if (mouseDown == 1) {
        draw(ctx, mouseX, mouseY, 12, true);
    }
}

The getMousePos() function finds the current position of the mouse pointer when a mouse event is triggered. The offsetX and offsetY properties return the x- and y-coordinates of the pointer relative to the target element. The layerX and layerY properties, used here as a fallback for older browsers, return the coordinates of the event relative to the current layer.

function getMousePos(e) {
    if (!e)
        e = window.event; // fallback for old browsers without an event argument
    // compare against undefined so a coordinate of 0 is not skipped
    if (e.offsetX !== undefined) {
        mouseX = e.offsetX;
        mouseY = e.offsetY;
    }
    else if (e.layerX !== undefined) {
        mouseX = e.layerX;
        mouseY = e.layerY;
    }
}
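As a side note, a more robust way to get canvas-relative coordinates in modern browsers (an optional variation, not part of the original code) is to use getBoundingClientRect():

// optional variation: canvas-relative coordinates via getBoundingClientRect()
function getMousePosModern(e) {
    var rect = canvas.getBoundingClientRect();
    mouseX = e.clientX - rect.left;
    mouseY = e.clientY - rect.top;
}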

Touch Event Handlers:

The sketchpad_touchStart() function is triggered when the user first touches the sketchpad. It records the touch position and calls draw() with a false flag, which tells the draw function to note the point but not to draw. The preventDefault() method cancels the browser's default touch behavior, which stops the screen from scrolling while the user is drawing.

function sketchpad_touchStart(e) {
    getTouchPos(e);
    draw(ctx, touchX, touchY, 12, false);
    e.preventDefault(); // stop the page from scrolling while drawing
}

The sketchpad_touchMove() function is triggered when the user drags a finger across the sketchpad. It updates the touch position via getTouchPos() and calls draw() with a true flag, which tells the draw function to draw a line from the previous position to the current touch position.

function sketchpad_touchMove(e) {
    getTouchPos(e);
    draw(ctx, touchX, touchY, 12, true);
    e.preventDefault();
}

The getTouchPos() function finds the point on the sketchpad where the user is touching. The e.touches.length property tells us how many fingers are touching the screen; for proper functioning, we only handle single-point touch. Hence, only when e.touches.length is 1 does the function store the x- and y-coordinates of the touch point in touchX and touchY, converting them to canvas coordinates by subtracting the canvas element's offset.

function getTouchPos(e) {
    if (!e)
        e = window.event; // fallback for old browsers without an event argument
    if (e.touches) {
        if (e.touches.length == 1) { // single-finger touch only
            var touch = e.touches[0];
            touchX = touch.pageX - touch.target.offsetLeft;
            touchY = touch.pageY - touch.target.offsetTop;
        }
    }
}

Clearing the Sketchpad:

Once a drawing or prediction is done, the user may want to try a new one, so we provide a Clear button to reset the sketchpad. When it is pressed, the canvas is cleared with the clearRect() method and refilled with a black background using the fillStyle property and the fillRect() method, giving the user a fresh sketchpad to draw on.

document.getElementById('clear_button').addEventListener("click",
    function() {
        ctx.clearRect(0, 0, canvas.width, canvas.height);
        ctx.fillStyle = "black";
        ctx.fillRect(0, 0, canvas.width, canvas.height);
    });

Integrating the CNN Model:

Now that the sketchpad is working, we can focus on integrating the model into the web app. This involves the following steps:

Loading the Model:

The base URL of the website on which our web app is deployed is obtained from the window.location.origin property; the model files uploaded to the server can then be located relative to this base URL. The model's JSON file is loaded with TensorFlow.js's tf.loadLayersModel() inside an async function, since loading is asynchronous:

//main.js
var base_url = window.location.origin;
let model;
(async function() {
    console.log("model loading...");
    model = await tf.loadLayersModel(base_url + "/models/model.json");
    console.log("model loaded..");
})();
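The tf global used above comes from the TensorFlow.js library, which the page must load before main.js. A typical way to do this is a CDN script tag in index.html (the version number shown is only illustrative):

<!-- index.html: load TensorFlow.js before main.js (version is illustrative) -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@1.3.2/dist/tf.min.js"></script>
<script src="main.js"></script>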

Preprocessing:

The digit drawn on the sketchpad is passed to the model as an image so that it can predict its value. Before being passed to the model, this image must be preprocessed into the format the model expects. The tf.browser.fromPixels() method creates a tensor from the canvas pixels, which will flow into the first layer of the model. The resizeNearestNeighbor() method resizes the image to the 28x28 resolution the model was trained on. The mean(2) call averages across the color-channel axis, converting the RGB image to grayscale, and the two expandDims() calls add back a channel dimension and a batch dimension, producing the shape [1, 28, 28, 1]. The toFloat() method casts the tensor to float, and finally div() divides it by the maximum RGB value (255) to normalize the pixels to the range [0, 1].

function preprocessCanvas(image) {
    // resize the input image to the model's expected shape of [1, 28, 28, 1]
    let tensor = tf.browser.fromPixels(image)
        .resizeNearestNeighbor([28, 28]) // downscale to 28x28
        .mean(2)                         // RGB -> grayscale
        .expandDims(2)                   // add channel dimension
        .expandDims()                    // add batch dimension
        .toFloat();                      // cast to float32
    console.log(tensor.shape);
    return tensor.div(255.0);            // normalize pixels to [0, 1]
}

Prediction:

When the Predict button is pressed, this handler is triggered and the prediction is carried out by the model. The canvas element itself is passed to the preprocessCanvas() function, where it is preprocessed as described above. The resulting tensor is then passed to the model for prediction with the await operator, which makes the program wait until the model returns a result. Finally, the array of prediction values is passed to the displayLabel() function to display the result. (The canvas.toDataURL() method returns a data URL containing a representation of the image in the specified format, PNG by default; it is handy for inspecting the drawing but is not actually needed for prediction here.)

document.getElementById('predict_button').addEventListener("click",
    async function() {
        var imageData = canvas.toDataURL(); // not used for prediction; handy for debugging
        let tensor = preprocessCanvas(canvas);
        let predictions = await model.predict(tensor).data();
        let results = Array.from(predictions);
        displayLabel(results);
        console.log(results);
    });

Output:

The displayLabel() function displays the predicted output in the web app. The index of the maximum value in the array passed by the prediction handler is the predicted digit, and the value at that index is the confidence of the prediction. Both are written into HTML elements via the innerHTML property to display the result.

function displayLabel(data) {
    var max = data[0];
    var maxIndex = 0;
    // find the index of the highest probability (the predicted digit)
    for (var i = 1; i < data.length; i++) {
        if (data[i] > max) {
            maxIndex = i;
            max = data[i];
        }
    }
    document.getElementById('result').innerHTML = maxIndex;
    document.getElementById('confidence').innerHTML = "Confidence: "
        + (max * 100).toFixed(2) + "%";
}
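As an aside, the same arg-max can be written more compactly in plain JavaScript (an equivalent alternative, not the original code):

// equivalent one-liner: index of the largest prediction value
var maxIndex = data.indexOf(Math.max.apply(null, data));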

And that’s it, we have successfully created a web application that predicts hand-written digits. It can be used in the browser on a local machine. To publish it on the world wide web, we can use two tools: i) Node.js for server-side scripting and ii) Heroku for hosting our web app.
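For completeness, here is a minimal sketch of such a Node.js server, assuming the Express package and a public/ folder containing index.html, main.js, and the models/ directory (this file layout is an assumption, not necessarily the repository's exact structure):

// server.js — a minimal static server sketch (assumes: npm install express)
const express = require('express');
const app = express();

// serve index.html, main.js and the models/ folder as static files
app.use(express.static(__dirname + '/public'));

// Heroku supplies the port via the PORT environment variable
const port = process.env.PORT || 3000;
app.listen(port, () => console.log('App listening on port ' + port));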

Source Code: Github Repo

Demo: Heroku App

You can read Part I of this project here.

Expecting your feedback…!!!

Thank you and see you soon in the next project…!!!
