Day 14 of 100DaysofML

Charan Soneji · Published in 100DaysofMLcode · Jun 30, 2020

Image classification using TensorFlow from the absolute basics. I covered the essentials of TensorFlow in the previous blog, so I thought of doing a hands-on project that goes over the basics of TensorFlow.

So I'm going to be working on the MNIST dataset, which is something almost every individual works on when they start projects in DL. For those who aren't familiar with it, MNIST is a dataset of images of digits written in different handwriting styles, along with labels identifying which digit is actually written in each picture; these labels are what make the training phase of the model possible. Alright, that's the gist of the dataset, let's dive right in.

To run the program, make sure to have TensorFlow installed in your environment. You can use Anaconda to help with the installation and maintenance of versions in your environment, or you could use Google Colab, Kaggle, or any other online environment that can help you build the NN.
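If you are working locally, TensorFlow can usually be installed with pip (a standard CPU install; your exact setup may vary):

pip install tensorflow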

To help you understand the creation of the model, I'll give a brief explanation as I go through each of the stages.

1. Understanding the problem: In my opinion, this is one of the most crucial stages, because people often start developing their neural network while still unsure of what they are trying to build. So let's take a minute and try to understand the usage of the MNIST dataset and what exactly we are trying to prove or achieve.

What we want to achieve.

In the given picture, you can see a handwritten number 5, and we want our computer to recognize that the number is 5 using our neural network. As the last part of this step, I'm going to import the libraries; Keras, the high-level wrapper package, will be used on top of TensorFlow throughout.
import tensorflow as tf
print("Using TensorFlow version {}".format(tf.__version__))

2. Understanding the dataset: This is the next crucial step: identifying the features and parameters on which the training takes place. Fortunately, TensorFlow ships with several datasets as part of its library, so we shall import the dataset of images along with the labels and store them in separate variables.

from tensorflow.keras.datasets import mnist
(x_train,y_train),(x_test,y_test)=mnist.load_data()

We have imported our data here and stored it in different variables. x_train refers to the pictures for training,
y_train refers to the labels for training,
x_test refers to the pictures for testing, and
y_test refers to the labels for testing. They have all been returned by the mnist.load_data() function.

We can check the shape of the imported training and testing data using the .shape attribute.

print('x_train_shape: ',x_train.shape)
print('y_train_shape: ',y_train.shape)
print('x_test_shape: ',x_test.shape)
print('y_test_shape: ',y_test.shape)
Shape of the MNIST dataset

After running the above code in the cell, we notice that our training data has 60,000 images whereas our testing data has 10,000 pictures. This is sufficient for testing the overall accuracy of the model. We shall now import matplotlib.pyplot to visualize the dataset and look at the data just to get an understanding.

import matplotlib.pyplot as plt
%matplotlib inline
plt.imshow(x_train[0],cmap='binary')
plt.show()
#OUTPUT SHOWN BELOW
Output obtained

The whole point here is that the DL model can understand the relation between this image and the label using complex mathematical computations.

3. One Hot Encoding: This is a topic I shall cover in more detail in another blog, but to give an overall understanding: it is used to convert our class labels into binary vectors that the network can be trained against.
Take a look at the picture given below:

After One Hot Encoding

Here, for the given label 5, the entry at index 5 is 1, whereas for the given label 7, the entry at index 7 is 1, and the remaining entries are 0.
Note that we have a total of 10 possible labels, because the digits run from 0 to 9.

Let us proceed with this straightforward process.

from tensorflow.keras.utils import to_categorical
y_train_encoded=to_categorical(y_train)
y_test_encoded=to_categorical(y_test)

Here, we convert the labels using the to_categorical function of tensorflow.keras, and we apply it only to the y values, since they are the class labels that need to be converted into one-hot vectors.

We can check the encoding labels using the following piece of code.

y_train_encoded

It will return a 2D array and you can figure out the values based on the explanation I have mentioned above.
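For instance, the first training label in MNIST happens to be 5, so its encoded row has a 1 at index 5 and 0 everywhere else. A quick check, assuming you have run the cells above:

print(y_train[0]) #prints 5
print(y_train_encoded[0]) #prints [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]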

4. Neural Network: This is arguably the most important step in the entire process: the creation of our neural network.

Creation of our NN

In the above representation, we can see what our NN looks like; we have discussed the individual components earlier as well, i.e., its weights and bias, and the vectorized representation is shown on the right.

One of the important things to understand is that our image is 28px*28px in dimension, and in order to feed it to our neural network, we need to convert it into a vector of 784 (28*28=784) dimensions. Thus, for our classification problem, we feed 784 different values to a node for training, multiply them by the required weights, and add the bias.

784 features given to the node for training
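To make this concrete, here is a minimal sketch of what a single linear node computes on one flattened image. The weights here are random placeholders of my own, not trained values:

import numpy as np
x=x_train[0].reshape(784) #flatten one 28x28 image into a 784-dimensional vector
w=np.random.randn(784) #illustrative random weights for a single node
b=0.0 #illustrative bias
z=np.dot(w,x)+b #weighted sum plus bias: the node's linear output
print(z)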

We can't use the above model on its own, because it would still remain a purely linear model, and stacking linear layers without activation functions adds no expressive power. This is why we use a neural network as shown in the diagram below.

MNIST NN

The difference in the case of a neural network lies in the activation function and the number of hidden layers (which increase the complexity of the network). Just to clarify, the input layer will have 784 features and the output layer will have 10 different labels (basically 10-dimensional).
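If you are curious what these activation functions actually compute, here are minimal numpy versions; these are sketches of the standard definitions, not the Keras internals:

import numpy as np
def relu(z):
    return np.maximum(0,z) #keeps positive values, zeroes out negatives
def softmax(z):
    e=np.exp(z-np.max(z)) #subtract the max for numerical stability
    return e/e.sum() #probabilities that sum to 1
print(softmax(np.array([1.0,2.0,3.0]))) #three probabilities summing to 1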

5. Preprocessing: A little preprocessing is needed for every dataset we work with. In this case, we need to convert our images into 784-dimensional vectors. We do that using numpy. Make sure to copy the code given below:

import numpy as np
x_train_reshaped=np.reshape(x_train,(60000,784))
x_test_reshaped=np.reshape(x_test,(10000,784))
print('x_train_reshaped',x_train_reshaped.shape)
print('x_test_reshaped',x_test_reshaped.shape)

The shapes would now be printed as (60000, 784) and (10000, 784), which shows that the images have been converted into the desired vectors.

Let us understand the values a bit. Type in the following code in your cell.

print(set(x_train_reshaped[0]))
Output of the reshaped values

The values range from 0 to 255, which is how pixel intensities are denoted. The computations would be faster if we normalize these values, and it would help the DL model as well.

x_mean=np.mean(x_train_reshaped)
x_std=np.std(x_train_reshaped)
epsilon=1e-10
x_train_norm=(x_train_reshaped-x_mean)/(x_std+epsilon)
x_test_norm=(x_test_reshaped-x_mean)/(x_std+epsilon)

Here, we first calculate the mean and standard deviation of the training set using numpy's inbuilt functions, and then we define an extremely small value, epsilon. We then move to the main normalization step, where we apply the formula: subtract the mean from each value and divide by the standard deviation. We add epsilon to the standard deviation because a very small SD would mean dividing by a value close to zero, which can lead to numerical instability; adding this small value solves the problem.
You may notice that we used the mean and standard deviation of the train dataset on the test dataset as well. We do this because the test set must go through exactly the same transformation the model was trained with; computing separate statistics from the test set would leak information about it into preprocessing and could introduce unnecessary bias. You can check out the normalized values by using:

print(set(x_train_norm[0]))
Normalized values obtained
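As a quick sanity check (a small sketch of mine, not part of the original flow), the normalized training data should now have mean roughly 0 and standard deviation roughly 1:

print(x_train_norm.mean()) #approximately 0
print(x_train_norm.std()) #approximately 1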

6. Creating our model: This is the last part of the setup. Our input layer will contain 784 nodes for all the features, and our output layer will contain 10 nodes for our 10 different labels. We also need to keep in mind that in the hidden layers, each node is connected to every node in the previous layer (such layers are called Dense layers). We will create our model using the Sequential() class present in Keras. Let us import this class first.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

Next, we need to create our model.

model=Sequential([
Dense(128,activation='relu',input_shape=(784,)),
Dense(128,activation='relu'),
Dense(10,activation='softmax')
])

Here, within our Sequential model, we have created hidden layers of 128 nodes each, taking an input of 784 dimensions, and the activation function used is 'relu', or Rectified Linear Unit. Keep in mind that the input layer is defined in the first layer when we mention input_shape, and it is not mentioned in the next hidden layer because it is already defined by the previous one. The output layer uses a softmax activation because that is what is needed for classification problems, as discussed earlier. You can also play around with the number of nodes if you are trying to experiment with your network.

The next thing we need to do is compile our model, specifying the loss function to minimize and 'accuracy' as the evaluation metric, which is done using the following code:

model.compile(
optimizer='sgd',
loss='categorical_crossentropy',
metrics=['accuracy'])
model.summary()

'sgd' refers to the stochastic gradient descent algorithm, which optimizes the model by updating the weights and biases on each pass through the data.

Model Summary obtained

The Param # column above shows the number of trainable parameters in each layer, i.e., the weights connecting it to the previous layer plus one bias per node.
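You can verify these counts by hand: a Dense layer has inputs*outputs weights plus one bias per output node.

print(784*128+128) #100480 parameters in the first hidden layer
print(128*128+128) #16512 parameters in the second hidden layer
print(128*10+10) #1290 parameters in the output layer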

7. Training: This is done using the training set which we have defined and normalized; the testing is then done on the testing set, because we do not want the model to be evaluated on previously seen values.
For our given problem, I am going to use 3 epochs, which means that we go through the training data 3 times. The syntax is:

model.fit(x_train_norm,y_train_encoded,epochs=3)

The training will now start, and since we specified accuracy as the evaluation metric, the accuracy will be displayed for each epoch.

Accuracy obtained

We see that the accuracy we obtain is 95.82%, which is pretty good, but note that the accuracy you obtain may be slightly different. Another thing we need to check is that overfitting does not occur, because then the model would merely memorize the training values, which is not what we want. Hence, we evaluate our model on the test set.

loss,accuracy=model.evaluate(x_test_norm,y_test_encoded)
Loss and accuracy of testing sample.

We thus obtain the loss and accuracy of the testing sample. Now we move onto the last step which is the prediction step.

8. Prediction: Here, we just go through a few values to make sure our model is evaluating or predicting correctly.

preds=model.predict(x_test_norm)
print('Shape of prediction:',preds.shape)

Here, as long as the shape is (10000, 10), you are good to go: 10,000 test samples, each with a 10-value probability vector (one probability per digit class).
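Since each row of preds is a probability vector, you can also compute the test accuracy by hand with argmax; a quick sketch:

pred_labels=np.argmax(preds,axis=1) #most probable class for each test image
print((pred_labels==y_test).mean()) #fraction of correct predictions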

plt.figure(figsize=(12,12))
start_index=0
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    pred=np.argmax(preds[start_index+i])
    gt=y_test[start_index+i]
    plt.xlabel('i={},pred={},gt={}'.format(start_index+i,pred,gt))
    plt.imshow(x_test[start_index+i],cmap='binary')
plt.show()

Here, our predictions needed a bit of processing before we displayed them, so we created a subplot grid to show the outputs, and we notice that one of our values is wrongly predicted as 6 instead of 5.

For the given wrongly classified point (index 8 in the test set), we plot its prediction vector and obtain this:

plt.plot(preds[8])
plt.show()
Softmax probability output

Here, we notice that the probability is highest at 6, which shows the image has been classified as 6. Hence, it is possible for values to be wrongly classified when they look alike, as in the case we saw with 5 and 6.

We have reached the end of this classification example and we have just scratched the surface for getting started with neural networks. Please do leave your feedback and clap if this helped you. Keep Learning.

Cheers.
