How to get top 2% position on Kaggle’s MNIST — Digit Recognizer

Applying CNN

This tutorial shows how to build and train a convolutional neural network model with Keras on top of TensorFlow.

In this first part I'll focus on the machine learning model, its parameters, and the results. In the second part I'll explain how to deploy this model as an API.

I trained the model on the MNIST dataset provided by Kaggle to produce good results at recognizing handwritten digits.

“The MNIST database is a large database of handwritten digits.”

For those who are new to TensorFlow and Keras, I recommend starting with some tutorials, playing around with the code, and then coming back here.
For those who are entirely new to deep learning, I recommend checking out fast.ai, Cognitive Class AI, Siraj, or Sentdex.

Before we start: here are some hints on how I set up my workstation for this tutorial. Please make sure you have all of this running on your computer before you continue. The installation process might take some time, so be sure it doesn't clash with the plans of your significant other.

Prerequisites for this tutorial:

Alright, if you made it to this point, some of this will already be familiar to you, which is really good!

The Convolutional Neural Network Model

A convolutional neural network can have tens or hundreds of layers, each of which learns to detect different features of an image. Filters are applied to each training image at different resolutions, and the output of each convolved image is used as the input to the next layer.

In our case, each data point represents a 28-by-28-pixel image, for a total of 784 pixels. Each pixel has a single pixel value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel value is an integer between 0 and 255, inclusive.
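As a quick illustration (this is my own sketch, not the original code), here is how such a flattened row of 784 pixel values can be reshaped into a 28x28 image and normalized to the [0, 1] range that neural networks prefer:

```python
import numpy as np

# Hypothetical example: one flattened 784-pixel row, as provided
# in Kaggle's train.csv (values are integers in [0, 255])
pixels = np.random.randint(0, 256, size=784)

# Reshape to 28x28 and scale pixel values from [0, 255] to [0, 1]
image = pixels.reshape(28, 28).astype("float32") / 255.0

print(image.shape)  # (28, 28)
```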

After some experiments and research, I arrived at this model:

CNN Model — Input Layer + Hidden Layers + Fully Connected + Output Layer

Let's code this model.

I prefer to keep the code organized, so I put the model in a separate file and import it later in the main program.
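Since the original code embed is not shown here, below is a sketch of what such a model file could look like. The overall shape follows the caption above (input layer, hidden convolutional layers, a fully connected layer, and a softmax output), but the specific filter counts, kernel sizes, dropout rates, and dense units are my assumptions, not necessarily the author's final configuration:

```python
# model.py -- a sketch of a Keras CNN for MNIST; layer sizes and
# hyperparameters here are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Dropout,
                                     Flatten, Dense)

def build_model():
    model = Sequential([
        # Input + first convolutional block
        Conv2D(32, kernel_size=(3, 3), activation="relu",
               input_shape=(28, 28, 1)),
        Conv2D(32, kernel_size=(3, 3), activation="relu"),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(0.25),
        # Second convolutional block
        Conv2D(64, kernel_size=(3, 3), activation="relu"),
        Conv2D(64, kernel_size=(3, 3), activation="relu"),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(0.25),
        # Fully connected head
        Flatten(),
        Dense(256, activation="relu"),
        Dropout(0.5),
        # Output layer: one probability per digit class (0-9)
        Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```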

You can play with some parameters (filter size, kernel size, padding, optimizer, learning rate, dense size, and others). Let me know your results.

Okay, now that the model is ready, it's time to code the main part.

This is the program:
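The embedded program is not reproduced here, so the following is a sketch of a typical main script for this competition: load Kaggle's `train.csv` and `test.csv`, preprocess, train, and write a submission file. The file names follow the Digit Recognizer competition; the epoch count, batch size, and the `model.py`/`build_model` names for the separate model file are my assumptions:

```python
# main.py -- sketch of the training pipeline; hyperparameters are
# illustrative assumptions, not the author's exact values.
import numpy as np
import pandas as pd
from tensorflow.keras.utils import to_categorical

def preprocess(df, has_label=True):
    """Split a Kaggle MNIST dataframe into normalized images
    (and one-hot labels when present)."""
    if has_label:
        y = to_categorical(df["label"].values, num_classes=10)
        X = df.drop(columns=["label"]).values
    else:
        y = None
        X = df.values
    # Reshape to (n, 28, 28, 1) and scale to [0, 1]
    X = X.reshape(-1, 28, 28, 1).astype("float32") / 255.0
    return X, y

if __name__ == "__main__":
    # Assumes the model lives in a separate file, as described above
    from model import build_model

    train = pd.read_csv("train.csv")
    test = pd.read_csv("test.csv")
    X_train, y_train = preprocess(train)
    X_test, _ = preprocess(test, has_label=False)

    model = build_model()
    model.fit(X_train, y_train, epochs=30, batch_size=64,
              validation_split=0.1)

    # Predict and write the Kaggle submission file
    preds = model.predict(X_test).argmax(axis=1)
    pd.DataFrame({"ImageId": np.arange(1, len(preds) + 1),
                  "Label": preds}).to_csv("submission.csv", index=False)
```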

You might have noticed that I've included graphs, elapsed time, and other visual results. I hope you have enjoyed this first part.
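In the same spirit as those visual results (again, a sketch rather than the original code), training curves can be plotted from the Keras `History` object and elapsed time measured with the standard library:

```python
# Sketch: plotting training curves and timing a run. Function and file
# names here are my own, chosen for illustration.
import time
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs without a display
import matplotlib.pyplot as plt

def plot_history(history_dict, out_file="training_curves.png"):
    """Plot train/validation accuracy from a Keras History.history dict."""
    fig, ax = plt.subplots()
    ax.plot(history_dict["accuracy"], label="train accuracy")
    ax.plot(history_dict["val_accuracy"], label="validation accuracy")
    ax.set_xlabel("epoch")
    ax.set_ylabel("accuracy")
    ax.legend()
    fig.savefig(out_file)
    plt.close(fig)
    return out_file

# Timing a (placeholder) training run
start = time.perf_counter()
# ... model.fit(...) would go here ...
elapsed = time.perf_counter() - start
print(f"Elapsed time: {elapsed:.1f} s")
```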

Results


Follow me on Medium or on my GitHub; I'll post the second part soon.

References:

https://www.mathworks.com/discovery/deep-learning.html

https://www.kaggle.com/c/digit-recognizer/

https://www.tensorflow.org/versions/r1.1/get_started/mnist/beginners

https://www.tensorflow.org/versions/r1.1/get_started/mnist/pros
