A more complex Neural Network

A Simple Neural Network for image classification

Luca Barone
Mar 2 · 6 min read

Introduction

In the previous article, I showed how to create the simplest possible Neural Network, with just 1 input node, 1 hidden node and 1 output node. In this article the Neural Network will be slightly more complex and will be used for image classification.

Image Classification

Image recognition is the ability to detect objects in an image and recognize or classify them into one of several classes.

From the moment we are born, we are trained with thousands of images every day without even realizing it. For this reason, it’s easy for us to distinguish a dog from a cat or any other animal. However, it is not so easy for a computer to imitate this process.
A computer views an image just as an array of numbers that represent how dark or bright each pixel is, and it tries to look for patterns to recognize and distinguish key features in the image.
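
To make the "array of numbers" idea concrete, here is a minimal sketch with NumPy. The 5x5 image below is made up for illustration (it is not taken from MNIST), and the "pattern" we look for is deliberately trivial.

```python
import numpy as np

# Hypothetical 5x5 grayscale image of a vertical stroke (made up for illustration).
# 0 is the background and 255 is the strongest pixel intensity.
image = np.array([
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
], dtype=np.uint8)

print(image.shape)  # (rows, columns)
print(image)        # the grid of numbers the computer actually works with

# A very simple "pattern": which column contains the stroke?
brightest_column = int(image.sum(axis=0).argmax())
print(brightest_column)
```

A neural network does something similar in spirit, except that it learns which patterns matter instead of having them written by hand.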

Implementation of a Neural network model for image classification

In this article we will classify images of handwritten digits from a common dataset called MNIST. In the TensorFlow library there are some datasets directly available in the tf.keras datasets API, like the MNIST handwritten digits dataset, often used as the “hello world” of machine learning programs.

The dataset contains 70,000 grayscale images (60,000 for training, 10,000 for testing), each with its own label, representing handwritten digits from 0 to 9 at low resolution (28x28 pixels).

First we need to import the TensorFlow library:

# Import
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

Now we can load the dataset. As said before, it is divided into training and test images, each set with its own labels, so we need to save the data in 4 variables:

mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Data Visualization

Visualizing our data and understanding what it contains is always an important step for every machine learning application.

First we can print the shape of the training images by using the shape attribute:

train_images.shape
(60000, 28, 28)

This confirms that there are 60,000 training images, each 28x28 pixels. Let’s now display the first training image and print its label and its pixel values:

plt.imshow(train_images[0])
print(train_labels[0])
print(train_images[0])
Example of a training image (left) and how it is read by the computer as an array of pixel intensities (right).

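As we can see, the values in the array, which express the pixel intensity, are between 0 and 255, but neural networks generally train better when the inputs are scaled to a small range such as 0 to 1. We can normalize the data by dividing every value by 255:

train_images = train_images / 255.0
test_images = test_images / 255.0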
Example of number 5 image after normalization.
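
The effect of the division can be checked without downloading anything. This is a small sketch that uses random pixels as a stand-in for real MNIST images (the `raw` array is an assumption for illustration only).

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 2 random 28x28 images
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# True division converts the uint8 values to floats between 0 and 1
scaled = raw / 255.0

print(raw.dtype, raw.min(), raw.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

The shape of the data does not change, only the range and the type of the values.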

Model creation

Let’s now design the model.

Compared to the model built in the previous article, this one is going to be just slightly more complex:

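model = tf.keras.models.Sequential([
    # Declaring the input shape builds the model immediately,
    # so that model.summary() can be called before training
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])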

Here we first define the model as sequential, meaning that it will be a neural network with a sequence of layers. The first layer is called Flatten; it allows us to “flatten” a matrix into a 1-dimensional array.

Example of Array flattening.
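
The same operation can be sketched with NumPy's reshape. This is only an illustration of the idea, not the internal implementation of the Keras layer.

```python
import numpy as np

# A small 2x3 matrix becomes a 1-dimensional array of 6 values
matrix = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
flat = matrix.reshape(-1)             # [0, 1, 2, 3, 4, 5]
print(flat)

# For a batch of images, every 28x28 image becomes a row of 784 values
batch = np.zeros((4, 28, 28))
flat_batch = batch.reshape(len(batch), -1)
print(flat_batch.shape)               # (4, 784)
```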

Thanks to the Flatten layer, we can now transform every image… and pass this array as input to our Dense layer, which represents a layer of neurons in the network. Every Dense layer needs an activation function to tell its neurons what to do.

Now you may ask me: “Why didn’t you talk about the activation function in the previous article?” Well, when we use the Dense layer without specifying the activation function, it defaults to the linear activation: a(x) = x.
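
To see what "no activation" means in practice, here is a minimal NumPy sketch of what a fully connected layer computes. The function `dense` and the numbers used below are hypothetical and only for illustration; the real Keras layer also takes care of creating and training the weights.

```python
import numpy as np

def dense(x, weights, bias, activation=None):
    """Sketch of a fully connected layer: activation(x @ weights + bias)."""
    z = x @ weights + bias
    # With no activation the layer is linear: a(z) = z
    return z if activation is None else activation(z)

# One sample with 2 features, going into a layer with 3 neurons
x = np.array([[1.0, 2.0]])
weights = np.array([[0.5, -1.0, 2.0],
                    [1.5, 0.0, -0.5]])
bias = np.array([0.1, 0.2, 0.3])

out = dense(x, weights, bias)
print(out)
```

Each output value is just a weighted sum of the inputs plus a bias, which is why a non-linear activation is needed to learn anything more complex.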

In this case, because the first layer of the model needs to learn what a digit is during training, we use ReLU (Rectified Linear Unit) as the activation function. Mathematically, the ReLU function returns x for all values of x > 0, and returns 0 for all values of x ≤ 0.
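
The definition translates almost literally into NumPy. This is a sketch of the math, with the helper name `relu` chosen for this example.

```python
import numpy as np

def relu(x):
    """Return x where x > 0, and 0 everywhere else."""
    return np.maximum(x, 0.0)

values = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(values))
```

Negative values and zero are mapped to 0, while positive values pass through unchanged.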

The output of this layer will be the input to the second Dense layer, which uses the softmax activation function. Softmax turns the 10 output values into a probability for each class, and the class with the highest probability is the model’s estimate of the digit.
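
Here is a small sketch of softmax in NumPy. The 10 scores below are hypothetical, standing in for what the last layer might compute for one image.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for the digits 0 to 9
logits = np.array([0.1, 0.3, 2.5, 0.0, -1.0, 4.0, 0.2, 0.1, 1.0, -0.5])

probs = softmax(logits)
predicted = int(np.argmax(probs))  # index of the most probable class

print(probs.round(3))
print(predicted)
```

Picking the final class is a separate step (argmax) applied to the probabilities that softmax returns.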

After defining the model, we need to compile it with model.compile(); this function is used to configure the loss, optimizer and metrics of the model.

model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
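
To get a feeling for what the 'sparse_categorical_crossentropy' loss measures, here is a simplified NumPy sketch. It assumes the labels are integers and the predictions are already probabilities; the Keras version adds details such as clipping that are left out here, and the sample numbers are made up.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Mean of -log(probability the model assigned to the true class)."""
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# Hypothetical predictions for 2 images over 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

good_loss = sparse_categorical_crossentropy(np.array([0, 1]), probs)  # confident and right
bad_loss = sparse_categorical_crossentropy(np.array([2, 0]), probs)   # confident and wrong

print(good_loss, bad_loss)
```

The loss is small when the model gives a high probability to the right class and grows quickly when it does not, which is the signal the optimizer uses to adjust the weights.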

With the model.summary() command we can review a summary of the model: the layers, the shape of the inputs/outputs and the number of parameters it handles:

model.summary()

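As we can see from the output of the model summary, the Flatten layer has no parameters, the first Dense layer has 100,480 and the output layer has 1,290, for a total of 101,770 parameters.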
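
The number of parameters of this model can be checked by hand: every neuron of a Dense layer has one weight per input plus one bias. The helper `dense_params` is a name chosen for this sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights (one per input and neuron) plus one bias per neuron."""
    return n_inputs * n_units + n_units

flatten_outputs = 28 * 28                          # 784 values, no parameters to learn
hidden_params = dense_params(flatten_outputs, 128) # 100480
output_params = dense_params(128, 10)              # 1290
total_params = hidden_params + output_params       # 101770

print(hidden_params, output_params, total_params)
```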

Model Training

Now we can train the model by calling model.fit(). During this phase the model will try to fit the training data to the training labels or, in other words, figure out the relationship between the training data and its labels, so that when it is given new images it can predict which class each image belongs to.

model.fit(train_images, train_labels, epochs=5)

The model reaches a high accuracy on the training data in just 5 epochs, and the training was pretty fast despite working on 60,000 images. With so few epochs there is little room to overfit, but only unseen data can confirm that.

So now let’s see how it handles unseen data by calling model.evaluate() and passing the test set and its labels as parameters; it will report back the loss and the accuracy on this data:

model.evaluate(test_images, test_labels)

As expected, the accuracy on unseen data is a little bit worse than on the training data.
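
The accuracy metric itself is simple: it is the fraction of images for which the most probable class matches the label. Here is a sketch with hypothetical predictions for 4 images and 3 classes (the numbers are made up, not real model output).

```python
import numpy as np

# Hypothetical predicted probabilities for 4 images over 3 classes
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])

predicted = probs.argmax(axis=1)               # most probable class per image
accuracy = float(np.mean(predicted == labels)) # fraction of correct predictions

print(predicted, accuracy)
```

In this example the last image is misclassified, so 3 predictions out of 4 are correct.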

Reflections on the model

So, what we did was feed raw pixels into a neural network that learned to recognize images. With an accuracy of 97.6%, the digit recognizer works really well on simple images where the digit is right in the middle of the image, but it struggles when the digit is not perfectly centered. Just the slightest position change ruins everything.
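
A small NumPy sketch shows why position matters so much for this kind of network. It uses a made-up vertical stroke instead of a real MNIST digit: after moving it 3 pixels to the right, the flattened input has nothing in common with the original.

```python
import numpy as np

# Hypothetical "digit": a vertical stroke, 20 pixels tall and 2 pixels wide
image = np.zeros((28, 28))
image[4:24, 13:15] = 1.0

# The same stroke moved 3 pixels to the right
shifted = np.roll(image, shift=3, axis=1)

# Number of pixel positions that are "on" in both images
overlap = int(np.sum((image > 0) & (shifted > 0)))

# Number of positions where the two flattened inputs differ
changed = int(np.count_nonzero(image.ravel() != shifted.ravel()))

print(overlap, changed)
```

Since a Dense layer learns one weight per pixel position, the weights that reacted to the original stroke now receive only zeros, even though a person would say the two images show the same thing.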

That’s where Convolutions are very powerful. A convolution is a filter that passes over an image, processing it, and extracting features that show a commonality in the image.
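
As a preview, here is a minimal sketch of a filter passing over an image, written with plain NumPy loops. The function `convolve2d`, the image and the kernel are all assumptions made for this illustration; like most deep learning libraries, the sketch does not flip the kernel, and it uses no padding and a stride of 1.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 6x6 image: dark on the left half, bright on the right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A filter that reacts to vertical edges (dark on the left, bright on the right)
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

features = convolve2d(image, kernel)
print(features)
```

The output is high only where the edge is, wherever that happens to be in the image, because the same small filter is reused at every position.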

What’s next?

In the next article I’ll explain the challenges that we face with image recognition and why fully-connected Neural Networks are not the best model for this task.

Full Code

Here you can see the full code or try it through Google Colab:


