Neural Networks : Learning How To Build An Image Classification Model

Rishab Ghanti
Sep 1, 2018 · 8 min read

Understanding Machine Learning, and how it is pushing the world towards general AI, can be a daunting experience for a novice because of the number of algorithms and techniques available. As a self-taught ML enthusiast, I would like to do my bit by sharing what I have learned, in order to help other enthusiasts like me understand ML better.

In this article I will focus on a popular technique of Machine Learning called Neural Networks. I will provide an introduction to neural networks and also state how large tech companies are using this to provide a better service to their users. The article will also describe how a neural network learns using a classification problem as an example. In the end I will provide an example along with the source code to build an image classification model using the MNIST dataset.

Introduction

Neural networks are computing systems vaguely inspired by the biological neural networks that constitute the human brain. Such systems learn to perform tasks by considering examples, generally without being programmed with any task-specific rules. For example, in image recognition a model can learn to identify images that contain cats by analyzing tons of example images that have been manually labeled as “cat” or “no cat”. It does this without any prior knowledge about cats, such as the fact that they have fur, tails, and whiskers. There is a ton of information about neural networks available online, so I will not spend more time introducing the topic and we can jump straight into its applications.

Applications

Below are some features, developed by large tech companies, that are powered by neural networks :

  • The feature where Facebook highlights faces and prompts friends to tag on photo upload (Image Recognition).
  • Depositing a cheque through a mobile app (Image Recognition).
  • Amazon — “customers who viewed this item also viewed” (Recommendation System).
  • Smart personal assistants such as Google Assistant, Siri, and Alexa (Speech Recognition and Natural Language Processing).

How Does A Neural Network Learn?

Neural networks mimic how the brain operates. A perceptron is the smallest building block of a neural network.

Fig 1 : Neuron, the primary component of the central nervous system

Perceptrons function much like the neurons in the brain. A neuron receives inputs through its dendrites, the nucleus performs some operation on those inputs, and the neuron then decides whether or not to send a nervous impulse out through the axon, as shown in Fig 1.

To understand how a neural network learns from examples, let’s consider a classification problem where we need to predict if a student gets accepted or rejected by a University. In order to evaluate the student we have their test score and grade score from school.

Consider a student with a test score of 7/10 and a grade score of 6/10.

Fig 2 : Graph of test score vs grade score

To solve a machine learning problem we need to look at past data, which is shown in Fig 2. This is a graph of the test scores vs the grades of students who have applied to that university. Blue points represent students who have been accepted and red points represent students who have been rejected. We can see that a student with a good test score and a good grade score has a higher chance of getting accepted.

This data can be separated by a straight line as in Fig 3 and this line is called the model. The data point that we need to predict, which has been marked by a question mark is situated above the line along with the blue points. Hence we can say that the student gets accepted.

Fig 3 : Model — line separating the blue and red points

Now let’s write an equation for this model. Let x1 be the value on the x-axis and x2 the value on the y-axis. Starting from the general form of a line, y = mx + b, and rearranging, the equation of this line can be written as 2x1 + x2 - 18 = 0. This equation is represented in the form of a perceptron in Fig 4, with x1 and x2 replaced by 7 and 6, which are the scores of the student we are predicting for.

Fig 4 : Perceptron

The output of the perceptron is the score, here 2(7) + 6 - 18 = 2. Since the score is greater than 0, we can predict that the student will be accepted.
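
The arithmetic behind Fig 4 can be written out in a few lines of Python. This is only a minimal sketch: the weights 2 and 1 and the bias -18 come from the line equation above, while the function names `perceptron_score` and `predict` are made up for illustration.

```python
def perceptron_score(x1, x2, w1=2.0, w2=1.0, bias=-18.0):
    """Weighted sum of the inputs plus the bias: 2*x1 + x2 - 18."""
    return w1 * x1 + w2 * x2 + bias


def predict(x1, x2):
    """Step function: accept when the score is positive (the point lies above the line)."""
    return "accepted" if perceptron_score(x1, x2) > 0 else "rejected"


# Student with scores 7 and 6, as in the example above.
score = perceptron_score(7, 6)   # 2*7 + 6 - 18 = 2
print(score, predict(7, 6))      # 2.0 accepted
```

Changing the weights or the bias moves the line, which is exactly what training does when it searches for the line that best separates the blue and red points.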

Since the data in Fig 2 could be separated by a straight line, the model could be represented by a single perceptron. But when the data is more complicated, as in Fig 5, several perceptrons have to be combined to obtain a model that fits the data.

Fig 5 : Dataset cannot be separated by a straight line

An example of a neural network that can be used to classify such a dataset is shown in Fig 6. Three models are combined to obtain the curve that classifies the data.

Fig 6 : Neural network to classify complex data
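
One way to picture this combination: each perceptron in the first layer draws its own straight line, and a further perceptron combines their outputs into a single decision. The sketch below uses made-up weights (they are not the ones in Fig 6) and a simple step activation, purely for illustration.

```python
def step(score):
    """Step activation: 1 if the score is positive, otherwise 0."""
    return 1 if score > 0 else 0


def perceptron(inputs, weights, bias):
    """A single perceptron: weighted sum plus bias, passed through the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def tiny_network(x1, x2):
    # First layer: two linear models, each a different straight line (hypothetical weights).
    h1 = perceptron((x1, x2), weights=(2.0, 1.0), bias=-18.0)
    h2 = perceptron((x1, x2), weights=(1.0, 2.0), bias=-18.0)
    # Output perceptron: fires only when both first-layer perceptrons fire (a logical AND).
    return perceptron((h1, h2), weights=(1.0, 1.0), bias=-1.5)


print(tiny_network(7, 6))  # point above both lines
print(tiny_network(8, 3))  # point above the first line only
```

The resulting boundary is no longer one straight line but the corner formed by two of them, which is the basic idea behind fitting data that a single perceptron cannot separate.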

Deep Learning To Recognize Handwritten Digits

In this section we will see how to code an image classification model that identifies images of handwritten digits. The dataset I have used to train my model is the MNIST dataset. It consists of 70,000 greyscale images, each depicting one handwritten digit from 0 to 9.

Fig 7 : MNIST dataset

Below I will show each step in coding the model along with the source code to perform the step in Python.

Load The Data :

For this project I have used Keras, which is an open source neural network library. It is a good place for beginners to start learning and implementing neural network models.

Fig 8 : Loading the MNIST Dataset
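
A minimal sketch of this step, assuming the Keras API that ships with TensorFlow (`tensorflow.keras`); the variable names are illustrative. The first call downloads the dataset, so it needs an internet connection.

```python
from tensorflow import keras

# load_data() returns the images and labels already split into training and test sets.
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()

print("Training images:", X_train.shape)  # (60000, 28, 28)
print("Test images:", X_test.shape)       # (10000, 28, 28)
```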

Data Preprocessing :

Each image in the MNIST dataset is 28 pixels high by 28 pixels wide, and hence the computer sees it as a 28×28 matrix. Every white pixel is encoded as 255, every black pixel as 0, and grey pixels are integers in between. As a quick pre-processing step we will rescale the images to have values between 0 and 1 by dividing every pixel in every image by 255.

Fig 9 : Normalize the data to have values between 0 and 1
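
In NumPy the rescaling is a single division. The sketch below uses a small random array as a stand-in for the MNIST images (an assumption, so that it runs without downloading anything); with the real data the same line would be applied to both the training and the test images.

```python
import numpy as np

# Stand-in for a batch of MNIST images: 5 images of 28x28 pixels, integers in 0..255.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Convert to float first, then divide so every pixel lies between 0 and 1.
images_scaled = images.astype("float32") / 255

print(images_scaled.min(), images_scaled.max())
```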

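The second part of the preprocessing step is one hot encoding, as in Fig 10. It is a process by which categorical variables are converted into a form that can be provided to ML models. Here each digit label becomes a vector of 10 values, with a 1 at the position of the digit and 0 everywhere else.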

Fig 10 : One hot encoding
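
To make the encoding concrete, here is a small NumPy sketch. The helper name `one_hot` is made up for illustration; Keras offers `keras.utils.to_categorical` for the same job.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Turn integer labels (0..num_classes-1) into rows containing a single 1."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # For row i, set the column given by labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# The digits 5, 0 and 4 become three rows of ten values each.
print(one_hot([5, 0, 4]))
```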

To be able to supply each 28×28 image matrix to the model, it has to be flattened into a vector of 784 values.
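
Flattening can be sketched with a NumPy reshape. The array of zeros is only a stand-in for real images; in Keras a `Flatten` layer at the start of the model is a common alternative to doing it by hand.

```python
import numpy as np

# Stand-in for three 28x28 images.
images = np.zeros((3, 28, 28), dtype="float32")

# Keep the first axis (one row per image) and unroll each 28x28 grid into 784 values.
flat = images.reshape(len(images), -1)

print(flat.shape)  # (3, 784)
```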

Building the model :

Each input vector has 784 values (28 × 28), hence the input layer will have 784 nodes. This is a multi-class classification problem since the output has to be one of the 10 digits from 0 to 9. Thus, the output layer has 10 nodes, as shown in Fig 11. There is no definite formula for the number of hidden layers to add to the model; you can start with any number of layers and modify it to improve the accuracy. When a neural network has multiple hidden layers it is called a deep neural network.

Fig 11 : Structure of the model

The above model can be created in Python using the code snippet below.

Fig 12 : Code to build the model
Fig 13 : Compile the model
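
A rough sketch of building and compiling such a model with `tensorflow.keras` follows. The two hidden layers of 512 nodes, the dropout rate, and the `rmsprop` optimizer are assumptions for illustration and not necessarily what Fig 12 and Fig 13 show; the input size of 784 assumes each 28×28 image has been flattened.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(input_size=784, num_classes=10):
    """Fully connected network: flattened image in, one probability per digit out."""
    model = keras.Sequential([
        keras.Input(shape=(input_size,)),
        layers.Dense(512, activation="relu"),   # hidden layer 1 (size is an assumption)
        layers.Dropout(0.2),                    # randomly drops units to reduce overfitting
        layers.Dense(512, activation="relu"),   # hidden layer 2 (size is an assumption)
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation="softmax"),  # one output node per digit
    ])
    # Categorical cross-entropy matches one hot encoded labels.
    model.compile(loss="categorical_crossentropy",
                  optimizer="rmsprop",
                  metrics=["accuracy"])
    return model


model = build_model()
model.summary()
```

The softmax output turns the 10 scores into probabilities that sum to 1, so the predicted digit is simply the node with the highest value.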

I have calculated the accuracy of the model before training in order to illustrate the improvement after training it.

Fig 14 : Accuracy of the model before training it

Training The Model :

The next step is to train the model using the MNIST dataset which can be done in Python as below.

Fig 15 : Train the model

An epoch is one complete pass of the training dataset through the model. Since I have set the number of epochs to 10 in the above code snippet, the entire dataset will be run through the model 10 times. The weights from the epoch with the best accuracy among these 10 are loaded onto the model at the end of the training process, as in Fig 16.

Fig 16 : Load the best weights onto the model
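
Training for 10 epochs and then restoring the best weights could look roughly like the sketch below, which uses the `ModelCheckpoint` callback from `tensorflow.keras`. The function name, the file name, the batch size, and the 20% validation split are all assumptions made for illustration, and "best" here means the highest validation accuracy.

```python
from tensorflow import keras


def train_and_restore_best(model, X_train, y_train, epochs=10,
                           weights_path="mnist.best.weights.h5"):
    """Train, save weights whenever validation accuracy improves, then reload the best."""
    checkpoint = keras.callbacks.ModelCheckpoint(
        filepath=weights_path,
        monitor="val_accuracy",   # needs metrics=["accuracy"] when compiling the model
        mode="max",
        save_best_only=True,
        save_weights_only=True,
    )
    history = model.fit(
        X_train, y_train,
        batch_size=128,
        epochs=epochs,
        validation_split=0.2,     # hold out 20% of the training data to score each epoch
        callbacks=[checkpoint],
        verbose=1,
    )
    # Replace the weights of the last epoch with those of the best epoch.
    model.load_weights(weights_path)
    return history
```

With a compiled model and the preprocessed arrays, `train_and_restore_best(model, X_train, y_train)` runs the 10 epochs and leaves the model holding the best weights it saw.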

Testing The Model :

The last step in this example is to test it and calculate the accuracy. The model is evaluated by running the testing data through it and the accuracy is calculated.

Fig 17 : Testing the model
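
In Keras, `model.evaluate(X_test, y_test)` returns the loss followed by the metrics chosen at compile time. To show what the accuracy number means, the sketch below computes it by hand from a toy set of predictions; the function name `accuracy_percent` and the 3-class example are made up for illustration.

```python
import numpy as np


def accuracy_percent(predictions, one_hot_labels):
    """Share of samples whose highest-scoring class matches the true class, in percent."""
    predicted_classes = np.argmax(predictions, axis=1)
    true_classes = np.argmax(one_hot_labels, axis=1)
    return 100.0 * np.mean(predicted_classes == true_classes)


# Toy example with 3 classes: the first two predictions are right, the third is wrong.
predictions = np.array([[0.1, 0.8, 0.1],
                        [0.7, 0.2, 0.1],
                        [0.3, 0.3, 0.4]])
labels = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 1, 0]])
print(accuracy_percent(predictions, labels))
```

For the MNIST model, `predictions` would be the output of `model.predict(X_test)` and `labels` the one hot encoded test labels.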

As can be seen above, the accuracy of the model is 98.09%, which means that it predicts the right answer for 98.09% of the test images. The accuracy of the model before training was 9.36%, roughly what random guessing among 10 digits would give. Hence, training improved the accuracy of the model by a large margin.

Conclusion :

In this article we saw how neural networks work and how to code one in Python. Neural networks, and Machine Learning in general, are booming and this is a great time to get into the field. This is going to be the next big thing, if it is not already!
There are several resources available online to teach yourself ML. Please feel free to get in touch with me to know more about my journey of learning ML and the materials I used.

Hopefully this article has helped you understand neural networks better and has given you an insight into how simple it is to code a neural network for your hobby project! :)
