Recognizing Handwritten Digits with TensorFlow

Deep learning is a subfield of machine learning built on a set of algorithms inspired by the structure and functioning of the brain.

TensorFlow is a machine learning framework created by Google and used to design, build, and train deep learning models.

This tutorial works through the MNIST dataset from this Kaggle competition while also explaining the basics of writing TensorFlow code.

Getting Started

  • Enter the challenge
  • Download the dataset from the competition

Import Libraries and Load MNIST Data
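
Here is a minimal sketch of this step, assuming the competition files train.csv and test.csv have been downloaded into the working directory and that we are using the TensorFlow 1.x API:

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split, ShuffleSplit

# Kaggle provides the digits as CSV files: one row per image, one column per pixel
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')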

Setup Constants and HyperParameters
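
The names below match the ones used later in the tutorial (IMAGE_WIDTH, COLOR_CHANNELS, LABELS, BATCH, STEPS, LR); the specific values are assumptions you can tune:

IMAGE_WIDTH = 28       # MNIST digits are 28x28 pixels
COLOR_CHANNELS = 1     # grayscale images
LABELS = 10            # ten digit classes, 0-9
BATCH = 128            # examples per training step
STEPS = 20000          # total training steps
LR = 0.001             # learning rate for RMSProp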

Preparing Data

  • One hot encoding of the labels
  • Reshaping into image shape: (# images, IMAGE_WIDTH, IMAGE_WIDTH, COLOR_CHANNELS)
  • Splitting into train and validation sets (all three steps are sketched below)
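
One possible sketch of these three steps; the variable names (train_data, train_labels, valid_data, valid_labels) are chosen to match the training code later in the tutorial:

labels = train['label'].values
images = train.drop('label', axis=1).values.astype(np.float32) / 255.0  # scale pixels to [0, 1]

# One-hot encode the labels: digit d becomes a length-10 vector with a 1 at index d
labels = np.eye(LABELS)[labels].astype(np.float32)

# Reshape flat 784-pixel rows into (# images, IMAGE_WIDTH, IMAGE_WIDTH, COLOR_CHANNELS)
images = images.reshape(-1, IMAGE_WIDTH, IMAGE_WIDTH, COLOR_CHANNELS)

# Hold out 10% of the training data for validation
train_data, valid_data, train_labels, valid_labels = train_test_split(
    images, labels, test_size=0.1, random_state=42)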

Architecture of our model

Let’s now build a network with two convolutional layers, followed by one fully connected layer.

Initialize the data with placeholders:

We start building by creating nodes for the input images and target output classes.
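
A minimal sketch of the two placeholders, assuming the TensorFlow 1.x API and the constants defined above; tf_data will hold a batch of images and tf_labels their one-hot labels:

# None lets the batch dimension vary between training and validation feeds
tf_data = tf.placeholder(tf.float32, shape=(None, IMAGE_WIDTH, IMAGE_WIDTH, COLOR_CHANNELS))
tf_labels = tf.placeholder(tf.float32, shape=(None, LABELS))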

Weight Initialization

To create this model, we are going to need a lot of weights and biases. We'll build two handy functions to do this for us.
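
One way to write these two helpers, using small random weights and a slightly positive bias:

def weight_variable(shape):
    # Small random values break the symmetry between neurons
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_variable(shape):
    # A small positive bias helps avoid "dead" ReLU units
    return tf.Variable(tf.constant(0.1, shape=shape))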

Convolution and Pooling

TensorFlow gives us a lot of flexibility in setting up convolution and pooling operations. Our convolutions use a stride of one and are zero-padded so that the spatial dimensions are maintained in the output. Our pooling is plain max pooling over 2x2 blocks with a stride of 2. We'll also build functions for these.
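
A sketch of those two helper functions:

def conv2d(x, W):
    # Stride of one and SAME (zero) padding keep the spatial dimensions unchanged
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # 2x2 max pooling with stride 2 halves the width and height
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')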

The Model

First Convolutional Layer:

We can now implement our first layer. It will consist of a convolution that computes 32 features for each 5x5 patch.
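
A sketch of this layer using the helpers above; the 2x2 max pooling after each convolution is what eventually shrinks the image to 7x7 by the time we reach the fully connected layer:

W_conv1 = weight_variable([5, 5, COLOR_CHANNELS, 32])  # 5x5 patches, 1 input channel, 32 features
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(tf_data, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)                         # 28x28 -> 14x14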

Second Convolutional Layer:

Our second conv layer will have 64 features for each 5x5 patch.
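
Continuing the sketch, this layer takes the 32 feature maps from the first layer as input:

W_conv2 = weight_variable([5, 5, 32, 64])  # 5x5 patches, 32 input channels, 64 features
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)            # 14x14 -> 7x7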

Fully Connected Layer:

After two rounds of 2x2 max pooling, our image has been reduced to 7x7, and we now add a fully connected layer with 1024 neurons.
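
In this sketch, the 7x7x64 feature maps are flattened into a vector before the matrix multiplication:

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])  # flatten the pooled feature maps
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)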

Readout Layer:

Finally, we add a readout layer that maps the 1024 features to the 10 digit classes, just like a single-layer softmax regression.
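
A sketch of the readout layer; tf_logits (a name chosen here) is the tensor of raw class scores that the training code below turns into a loss and a prediction:

W_fc2 = weight_variable([1024, LABELS])
b_fc2 = bias_variable([LABELS])
tf_logits = tf.matmul(h_fc1, W_fc2) + b_fc2  # raw scores for the 10 digit classes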

Train and Evaluate the model

How well does this model do?

We will use tf.Session to create a session that runs our model, logging the validation loss and accuracy after every 500 iterations of the training process.

# Loss, accuracy, and optimizer; tf_logits is the model output defined above
tf_pred = tf.nn.softmax(tf_logits)
tf_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=tf_logits, labels=tf_labels))
tf_acc = 100 * tf.reduce_mean(tf.to_float(tf.equal(tf.argmax(tf_pred, 1), tf.argmax(tf_labels, 1))))
tf_opt = tf.train.RMSPropOptimizer(LR)
tf_step = tf_opt.minimize(tf_loss)

# Create a session and initialize all variables
session = tf.Session()
session.run(tf.global_variables_initializer())

# ShuffleSplit yields a fresh random batch of BATCH examples for each of the STEPS iterations
ss = ShuffleSplit(n_splits=STEPS, train_size=BATCH)
ss.get_n_splits(train_data, train_labels)
history = []
for step, (idx, _) in enumerate(ss.split(train_data, train_labels), start=1):
    fd = {tf_data: train_data[idx], tf_labels: train_labels[idx]}
    session.run(tf_step, feed_dict=fd)
    if step % 500 == 0:
        # Log validation loss and accuracy every 500 steps
        fd = {tf_data: valid_data, tf_labels: valid_labels}
        valid_loss, valid_accuracy = session.run([tf_loss, tf_acc], feed_dict=fd)
        history.append((step, valid_loss, valid_accuracy))
        print('Step %i \t Valid. Acc. = %f \n' % (step, valid_accuracy))

After running the entire code, the final test accuracy comes out to approximately 98.7%.

For the complete script and the Jupyter Notebook, head over to my GitHub repo.
