A gentle introduction to Artificial Neural Networks

Published in How-Tos · 5 min read · Oct 29, 2019

A pictorial representation of an artificial neural network. Source: Wikipedia

Artificial intelligence is now applied almost everywhere in technology. Devices are smarter than ever before, and we owe a lot of that intelligence to neural networks. These networks enable devices to learn tasks such as recognising objects in pictures, transforming speech to text and translating from one language to another.

So what are they?

An artificial neural network (ANN) is a computing system loosely modelled on the biological neural networks in the brains of animals. It trains itself to perform a task by learning from examples (usually provided by humans). The more good examples it has, the better it becomes at that task. Artificial neural networks have been successful at learning how to:

Detect and recognise objects in an image.
Transform speech to text and vice versa.
Play board and video games.
Translate text from one language to another.

and many other tasks.

Building a neural network

Now that we know what neural networks are and the roles they play in our daily lives, let’s talk about how to build one. In this article, I will describe how to build a simple neural network using Keras, an open-source neural network library written in Python. Keras is a high-level API that runs on top of lower-level libraries such as TensorFlow and Theano, and it helps with setting up and building neural network projects quickly. We will be using TensorFlow as our low-level library for this project.

Let’s install these two dependencies:

pip install keras
pip install tensorflow

The neural network we will build will be simple: it is going to train itself to recognise handwritten digits like the ones below:

We will be working with the MNIST database, which contains 70,000 images of handwritten digits together with their labels, that is, each image is labelled with the digit it contains. Our neural network will learn from this database until it gets to a stage where it can see new digits and correctly label them.

To begin, create a Python script and add the following code:

from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

The code above imports the mnist database and then loads the data into four variables: train_images, train_labels, test_images and test_labels. These variables contain all the data we need to train and test the neural network.

More on neural networks

Now, let’s dig into the technical details of how a neural network works. A typical ANN has an input layer, one or more hidden layers and an output layer.

The input layer takes in the data which it is meant to train with, performs some computations on it and passes the results to the next layer (one of the hidden layers). That layer performs some computations on the input it got and passes the results to the next layer. This continues until we get to the last layer which gives a prediction of what the output should be for that particular input.
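To make the layer-by-layer flow above concrete, here is a minimal sketch of a forward pass in plain Python. The toy layer sizes, the weights and the helper names (relu, softmax, dense) are made up for illustration and are not Keras internals; Keras performs the same kind of arithmetic on much larger layers.

```python
import math

def relu(values):
    """Replace negative values with zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases, activation):
    """One fully connected layer: a weighted sum per neuron, then an activation."""
    sums = [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]
    return activation(sums)

# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 2 outputs.
hidden_weights = [[0.2, -0.5, 0.1], [0.4, 0.3, -0.2]]
hidden_biases = [0.0, 0.1]
output_weights = [[0.7, -0.3], [-0.6, 0.9]]
output_biases = [0.0, 0.0]

inputs = [1.0, 0.5, -1.0]
hidden = dense(inputs, hidden_weights, hidden_biases, relu)
prediction = dense(hidden, output_weights, output_biases, softmax)
print(prediction)  # two probabilities, one per output class
```

Each layer only ever sees the output of the layer before it, which is why the layers can be stacked one after another.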

This process is repeated for many examples, after which the network compares its predictions to the correct results, computes the average error and works out how to adjust itself to reduce that error next time. It then starts all over again by passing new data to its input layer, but this time with the knowledge gained from the previous errors.

After a sufficient amount of training, it would (optimistically) have reduced the errors in its predictions so much that its predictions would be accurate most of the time.
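The predict, compare and adjust loop can be sketched with the smallest possible "network": a single weight that has to learn the rule y = 2x. This is a simplified illustration that assumes plain gradient descent on a mean squared error; the toy data, the learning rate and the number of steps are arbitrary choices, and real libraries work out the adjustments for many weights at once.

```python
# Toy examples that follow y = 2 * x; the "network" is a single weight w.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

def mean_squared_error(w):
    """Average squared difference between predictions (w * x) and targets (y)."""
    return sum((w * x - y) ** 2 for x, y in examples) / len(examples)

w = 0.0                # start with a poor guess
learning_rate = 0.01   # how big each correction is
history = []           # error after each adjustment

for step in range(200):
    # Slope of the error with respect to w, averaged over all examples
    gradient = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
    # Nudge w in the direction that reduces the error
    w -= learning_rate * gradient
    history.append(mean_squared_error(w))

print('learned weight:', w)
print('first error:', history[0], 'last error:', history[-1])
```

The error shrinks with every pass, and the weight ends up very close to 2, the value that matches the examples.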

I suggest watching this video to learn more about how neural networks work with a visual explanation.

Creating the neural network

In our case, the input layer will take in a 28x28 pixel image as its input (the actual input will be a vector containing the value of each pixel), and the output layer will give a prediction of what number is in the image.
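As a small illustration of what "a vector containing the value of each pixel" means, the sketch below flattens a 28x28 grid into a list of 784 values, row by row. The blank image and the flatten helper are hypothetical stand-ins; with real data the same result comes from reshaping the image array.

```python
def flatten(image):
    """Join the rows of a 2D image into one long list of pixel values."""
    return [pixel for row in image for pixel in row]

# Hypothetical image: all black (0) except one white pixel (255)
image = [[0] * 28 for _ in range(28)]
image[14][10] = 255

vector = flatten(image)
print(len(vector))           # 784 values, one per pixel
print(vector[14 * 28 + 10])  # the white pixel, now at index row * 28 + column
```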

A lot of the code below uses technical terms that I won’t go into, but it works just as I have described in the section above. I recommend going through the Keras documentation to familiarise yourself with the library so you can better understand the code.

First, we will create a Sequential model (which is simply a stack of layers) that will serve as our neural network:

from keras import layers, models

network = models.Sequential()

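Then we add three layers to it: a first layer that receives the 784 pixel values, a second hidden layer and an output layer with one unit per digit:

# input_shape is only needed on the first layer; later layers infer their input size
network.add(layers.Dense(784, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(784, activation='relu'))
network.add(layers.Dense(10, activation='softmax'))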
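To get a feel for the size of this network, we can count the numbers it has to learn. The sketch below assumes the standard arithmetic for a fully connected layer (one weight per input per unit, plus one bias per unit); dense_params is a helper made up for this example, not a Keras function.

```python
def dense_params(num_inputs, num_units):
    """Weights (inputs x units) plus one bias per unit."""
    return num_inputs * num_units + num_units

# (inputs, units) for the three Dense layers above
layer_sizes = [(784, 784), (784, 784), (784, 10)]

counts = [dense_params(i, u) for i, u in layer_sizes]
total = sum(counts)

print(counts)  # [615440, 615440, 7850]
print(total)   # 1238730
```

So even this simple network has a little over 1.2 million adjustable values.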

Next, we compile the network model and specify that we will be using accuracy as our performance metric:

network.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
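The loss is the number the network tries to make smaller during training. As a rough sketch of the idea behind categorical cross-entropy, the function below scores one prediction against one encoded label. It illustrates the formula only and is not Keras's actual implementation; the two example predictions are invented.

```python
import math

def categorical_crossentropy(true_one_hot, predicted_probs):
    """Negative log of the probability assigned to the correct class."""
    tiny = 1e-12  # avoids log(0) when a probability is exactly zero
    return -sum(
        t * math.log(max(p, tiny))
        for t, p in zip(true_one_hot, predicted_probs)
    )

# The correct digit is 3, encoded with a 1 in position 3
true_label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

confident = [0.01] * 10
confident[3] = 0.91      # 91% sure the digit is a 3
unsure = [0.1] * 10      # no idea: every digit equally likely

confident_loss = categorical_crossentropy(true_label, confident)
unsure_loss = categorical_crossentropy(true_label, unsure)
print(confident_loss, unsure_loss)
```

A confident, correct prediction gives a loss close to zero, while a vague one is penalised much more heavily.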

With the above, we have successfully built a neural network.

Training and testing the network model

Recall that we stored our training and testing data in four variables: train_images, train_labels, test_images and test_labels. Before we can make use of this data with the network, we need to first convert the data to a form the network can read. To do that, we will reshape and encode the data:

from keras.utils import to_categorical

# Flatten each 28x28 image into a vector of 784 values and scale the pixels to the range 0 to 1
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

# Encode each label as a vector with one slot per digit
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

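In the above code, we are making use of 60,000 images for training and 10,000 images for testing. Each image is of size 28x28, so it is reshaped into a vector of 784 values, and dividing by 255 scales every pixel value into the range 0 to 1. Also, the function to_categorical() encodes each label as a vector with one slot per digit, where only the slot for the correct digit is set to 1.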
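To see what the scaling and the label encoding do to individual values, here is a plain Python sketch. The helpers one_hot and scale_pixels are hypothetical stand-ins written for this example; in the article's code the same work is done by dividing the arrays by 255 and by to_categorical().

```python
def one_hot(label, num_classes=10):
    """Encode a digit as a list of zeros with a single 1 at the digit's position."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

def scale_pixels(pixels):
    """Map pixel values from the range 0..255 to the range 0..1."""
    return [p / 255 for p in pixels]

print(one_hot(3))                 # the 1 sits in position 3
print(scale_pixels([0, 51, 255])) # black, dark grey and white
```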

We now have all the data we need in a form the network model can understand. We can now train the model by using the fit() method:

network.fit(train_images, train_labels, epochs=5, batch_size=128)

The batch_size is the number of training examples the network processes before it compares its predictions with the correct results and adjusts itself, while the number of epochs is how many times the network works through the entire training set, that is, how many times it completes the full cycle of predicting values, comparing them with the results and learning from the errors.
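A quick calculation shows how these two settings translate into work done. It assumes that each batch holds up to batch_size training examples and that the network adjusts itself once per batch, with a smaller final batch when the numbers do not divide evenly.

```python
import math

num_samples = 60000  # training images
batch_size = 128
epochs = 5

# Batches needed to cover every training image once
steps_per_epoch = math.ceil(num_samples / batch_size)

# Size of the final, partially filled batch
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

# Total number of adjustments over the whole training run
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(last_batch_size)  # 96
print(total_updates)    # 2345
```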

By calling the fit() method, the neural network trains itself with the training data. We can use the evaluate() method to test the accuracy of the network after training:

test_acc = network.evaluate(test_images, test_labels)[1]
print('Test accuracy:', test_acc)

The output after running the tests is:

Test accuracy: 0.9807

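which indicates that our neural network labelled 98.07% of the test images correctly, quite impressive. Your exact figure may differ slightly from run to run, because the network starts from random values each time it is trained.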
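The accuracy figure itself is simple to compute by hand. The sketch below assumes the usual definition (the predicted digit is the position with the highest probability, and accuracy is the fraction of predictions that match the labels); the four predictions over three classes are invented to keep the example short.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predictions, labels):
    """Fraction of examples whose most likely class matches the true label."""
    correct = sum(
        1 for probs, label in zip(predictions, labels)
        if argmax(probs) == label
    )
    return correct / len(labels)

# Hypothetical outputs for four examples and three classes
predictions = [
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
labels = [1, 0, 2, 1]  # the last prediction is wrong

print(accuracy(predictions, labels))  # 0.75
```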

To test the network with your own data, load your test images and test labels (making sure each image has size 28x28), prepare them in the same way as above by reshaping, scaling and encoding them, and then run the evaluate() method.

Thanks for following this far. I hope I have been able to demystify artificial neural networks so much that you are now able to build one by yourself. The code for this project can be found here, so you can use it as a starting point.

If you are unclear about anything or wish to comment on this article, please do so below.