Build an Artificial Neural Network from scratch to predict Coronavirus infection

Son Nguyen
9 min read · Feb 1, 2020


Disclaimer: This article exists only to teach about Artificial Neural Networks, and for fun. I'm not responsible for your use of the model I am about to provide in this article for any purpose, period.

TL;DR

As you know, the Coronavirus outbreak is spreading widely and quickly all around the globe. To predict whether you are Positive or Negative for this deadly virus based on the number of times you cough and the number of times you sneeze per hour, I am going to build an Artificial Neural Network from scratch to help you check your condition every day :)

How to Predict Coronavirus infection?

1. Artificial Neural Network Introduction

Neuroscientists estimate that our human brain contains tens of billions of Neurons like this:

Figure 1. A Neuron (Image source)

Each Neuron connects with many neighboring neurons to exchange electrical signals, and together they make up a Neural Network like this:

Figure 2. A simulated Neural Network (Image source)

This Neural Network plays a very important role in processing input information coming from the outside environment (via the five senses). For example, when we look at a pet, we can recognize almost immediately whether it is a dog, a cat, or a parrot. Neuroscientists explain that this is because our Neural Network worked out the answer.

The question is: how does our Neural Network work out the answer so quickly? The hypothesis is that we were taught how to distinguish pets in the past (by our parents, our friends, teachers at school, etc.), so our Neural Network learned how to classify pets and stored the lessons somewhere in our brain's memory. When our Neural Network receives a pet as input, it uses those lessons to work out the answer quickly.

Based on this idea, Computer Scientists built up an Artificial Neural Network (ANN) consisting of layers of artificial Neurons like this:

Figure 3. A typical Artificial Neural Network (Image source)

The above Artificial Neural Network is composed of 3 layers: INPUT, HIDDEN, and OUTPUT. These 3 layers have 3, 4 and 2 Neurons, respectively. Each Neuron in the INPUT layer connects with all Neurons in the HIDDEN layer and each Neuron in the HIDDEN layer connects with all Neurons in the OUTPUT layer. The number of HIDDEN layers and the number of Neurons in each HIDDEN layer are arbitrary, depending on your choice.

2. Build an Artificial Neural Network from scratch

For simplicity, let’s start with the following simple Neural Network:

Figure 4. A sample Neural Network

2.1. How does an Artificial Neural Network work?

The INPUT layer: There are 2 Neurons in this layer. The first one has value x₁, the second one has value x₂ and they don’t perform any task. They just pass their values to the connecting Neurons in the HIDDEN layer. That’s all.

The HIDDEN layer: There are 2 Neurons in this layer and each Neuron receives 2 inputs x₁ and x₂ from the INPUT layer. Also, there is a weight for each link between an INPUT Neuron and a HIDDEN Neuron. For example, the link (x₁, h₁) has a weight w₁, the link (x₂, h₁) has a weight w₂. Each HIDDEN Neuron performs two tasks: calculate Sum and pass Sum through an Activation Function:

Figure 5. Two tasks of a Neuron: Sum and Activation Function

The formula for calculating Sum is the following:

where * is the multiplication operator and b₁ and b₂ are biases.
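Written out with the weight names from Figure 4 (w₁ and w₂ for the links into h₁ and, by the same pattern, w₃ and w₄ for the links into h₂, which also matches the weights printed by the code below), the two sums are:

sum_h1 = w₁ * x₁ + w₂ * x₂ + b₁
sum_h2 = w₃ * x₁ + w₄ * x₂ + b₂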

A commonly used Activation Function is the sigmoid function:

The sigmoid function takes in an arbitrary value and outputs a value in the interval (0, 1).
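In formula form:

σ(x) = 1 / (1 + e^(−x))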

So the HIDDEN Neurons produce the following outputs:
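With the sums above passed through the sigmoid Activation Function, those outputs are:

h₁ = σ(sum_h1) = σ(w₁ * x₁ + w₂ * x₂ + b₁)
h₂ = σ(sum_h2) = σ(w₃ * x₁ + w₄ * x₂ + b₂)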

The OUTPUT layer: There is only one Neuron and this Neuron also performs two tasks: calculate Sum and pass Sum through an Activation Function like the Neurons in the HIDDEN layer:
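Using the remaining weights w₅ and w₆ for the links (h₁, o₁) and (h₂, o₁), and the bias b₃ for the OUTPUT Neuron (again matching the values printed by the code below), this works out to:

sum_o1 = w₅ * h₁ + w₆ * h₂ + b₃
o₁ = σ(sum_o1)

The final output o₁ is the prediction of the whole network.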

The process of performing all the operations above to obtain the final output is called Feedforward. In summary, an Artificial Neural Network takes in inputs, the Neurons in the HIDDEN layers perform their tasks to produce intermediate outputs, and finally the Neurons in the OUTPUT layer perform their tasks to produce the final results.

Alright, that’s how an Artificial Neural Network works. Let’s implement the Artificial Neural Network as shown in Figure 4:

Next, instantiate a Neural Network and pass an example input to it:
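A sketch of that step, where the input values are arbitrary illustrative numbers and the print format only approximates the output shown below:

network = NeuralNetwork()

# Example input: [coughs per hour, sneezes per hour] -- illustrative values
x = [2, 1]
o1 = network.feedforward(x)

print("Weights:")
print("w1 = {},\nw2 = {},\nw3 = {},\nw4 = {},\nw5 = {},\nw6 = {}".format(
    network.w1, network.w2, network.w3, network.w4, network.w5, network.w6))
print("\nBiases:")
print("b1 = {},\nb2 = {},\nb3 = {}".format(network.b1, network.b2, network.b3))
print("\nOutputs:")
print("o1 = {}".format(o1))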

We get the following result:

Weights:
w1 = 0.09396105470442227,
w2 = -1.5045583206059883,
w3 = 0.211750414394884,
w4 = 1.5204933625854835,
w5 = 0.005297873241540633,
w6 = 1.3658159459493195

Biases:
b1 = -0.2479382678218179,
b2 = -1.9410049921459698,
b3 = 2.6657240308781005

Outputs:
o1 = 0.982561556091888

2.2. How to train an Artificial Neural Network?

As you can see in the above subsection, an Artificial Neural Network takes in inputs, processes them, and produces outputs. But what do the outputs mean? How do we evaluate how good the outputs are? Usually, the outputs are predictions, and to evaluate the preciseness of those predictions we need a specialized tool; a commonly used one is the Mean Squared Error (MSE):
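In its standard form, over n samples:

MSE = (1/n) * Σᵢ (Yᵢ − Ŷᵢ)²   (summing over i = 1 … n)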

where Yᵢ are the ground-truth results (real results) corresponding to real inputs and Ŷᵢ are the predictions inferred by the Neural Network. In Machine Learning, MSE is used as the loss function and is denoted L = MSE.

Now let’s take some intuitive examples. Suppose that we have a ground-truth dataset about patients infected with the Coronavirus in Wuhan, China:

Table 1. A sample of a ground-truth dataset

We have the first input x = [15, 3] with the corresponding real result y_true = 1. Passing this input to our Neural Network, we get a prediction y_pred = o₁ as follows:

Outputs:
o1 = 0.9824945598864127

So y_pred = o₁ = 0.9824945598864127 and therefore, the loss function is:
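Treating this single example on its own (so n = 1 in the MSE formula), the loss works out to:

L = (y_true − y_pred)² = (1 − 0.9824945598864127)² ≈ 0.000306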

The smaller L is, the more precise the prediction y_pred is. The ultimate goal of training an Artificial Neural Network is to make the loss function as small as possible over the whole ground-truth dataset.

Because L = (y_true − y_pred)², if y_pred changes, L will change accordingly. So we can say L is a function of the variable y_pred:

L = f(y_pred)

Moreover, from the equations in subsection 2.1, y_pred = o₁ is computed from the weights, the biases, and the inputs. Therefore, we can say L is a multivariable function of the weights and biases:

L = f(w₁, w₂, w₃, w₄, w₅, w₆, b₁, b₂, b₃)

Obviously, if we nudge one of the weights or biases a bit, L will change accordingly. So to make L as small as possible, we need to find the particular weights and biases at which L is at its minimum. How can we do that? Fortunately, there is an optimization algorithm we can use to find those weights and biases: Stochastic Gradient Descent.

Stochastic Gradient Descent is pretty simple:

  • Step 1: Initialize the weights and biases with random values.
  • Step 2: Take an item from the ground-truth dataset and calculate the gradients (partial derivatives) of the loss with respect to every weight and bias, as written out after this list.
  • Step 3: Update the weights and biases using the update rule written out after this list:

where η is a constant called the learning rate.

  • Step 4: Repeat Steps 2 and 3 for the next item in the ground-truth dataset.
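Spelled out for our small network, whose parameters are w₁ … w₆ and b₁, b₂, b₃, the gradients of Step 2 are the partial derivatives of L with respect to each parameter:

∂L/∂w₁, ∂L/∂w₂, ∂L/∂w₃, ∂L/∂w₄, ∂L/∂w₅, ∂L/∂w₆, ∂L/∂b₁, ∂L/∂b₂, ∂L/∂b₃

and the update rule of Step 3 nudges every parameter in the opposite direction of its gradient, for example:

w₁ ← w₁ − η * ∂L/∂w₁

(and likewise for the other weights and biases).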

The process of calculating gradients and updating weights and biases is called Backpropagation, because it propagates information about the error backward through the entire neural network.

Alright, let’s do some math to work out gradients for our Artificial Neural Network. With the support of the Chain Rule, we can represent gradients as follows:
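Taking ∂L/∂w₁ as the worked example (the other gradients follow the same pattern), the Chain Rule lets us split it into two factors:

∂L/∂w₁ = (∂L/∂y_pred) * (∂y_pred/∂w₁)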

Because

y_pred = o₁ = σ(w₅ * h₁ + w₆ * h₂ + b₃) = σ(sum_o1)

and, of the two HIDDEN outputs, only h₁ depends on w₁, continuing to apply the Chain Rule we have

∂y_pred/∂w₁ = (∂y_pred/∂h₁) * (∂h₁/∂w₁)

Because the sigmoid function has the following derivative:

σ'(x) = σ(x) * (1 − σ(x))

we can represent each factor as

∂L/∂y_pred = −2 * (y_true − y_pred)
∂y_pred/∂h₁ = w₅ * σ'(sum_o1)
∂h₁/∂w₁ = x₁ * σ'(sum_h1)

So we have just worked out the first gradient:

∂L/∂w₁ = −2 * (y_true − y_pred) * w₅ * σ'(sum_o1) * x₁ * σ'(sum_h1)

Similarly, still referring to the equations in subsection 2.1 and using the Chain Rule, we can derive gradients for the rest of the weights and biases as follows:
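Following the same Chain Rule pattern, they work out to the expressions below (using sum_h1, sum_h2, and sum_o1 as defined in subsection 2.1):

∂L/∂w₂ = −2 * (y_true − y_pred) * w₅ * σ'(sum_o1) * x₂ * σ'(sum_h1)
∂L/∂b₁ = −2 * (y_true − y_pred) * w₅ * σ'(sum_o1) * σ'(sum_h1)
∂L/∂w₃ = −2 * (y_true − y_pred) * w₆ * σ'(sum_o1) * x₁ * σ'(sum_h2)
∂L/∂w₄ = −2 * (y_true − y_pred) * w₆ * σ'(sum_o1) * x₂ * σ'(sum_h2)
∂L/∂b₂ = −2 * (y_true − y_pred) * w₆ * σ'(sum_o1) * σ'(sum_h2)
∂L/∂w₅ = −2 * (y_true − y_pred) * h₁ * σ'(sum_o1)
∂L/∂w₆ = −2 * (y_true − y_pred) * h₂ * σ'(sum_o1)
∂L/∂b₃ = −2 * (y_true − y_pred) * σ'(sum_o1)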

Now we have obtained all the necessary tools to train our Artificial Neural Network. Let's implement the training process:
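The full training code is published with the article's source; the sketch below condenses the same idea (one Backpropagation update per dataset item, repeated for a number of epochs, while recording the minimum, average, and maximum loss of each epoch). The method and parameter names (train, epochs, learning_rate) are my own and may differ from the original, and the class skeleton from subsection 2.1 is repeated here in compact form so the listing is self-contained.

import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    # sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)


class NeuralNetwork:
    def __init__(self):
        # Random initial weights and biases, as in the earlier listing
        for name in ("w1", "w2", "w3", "w4", "w5", "w6", "b1", "b2", "b3"):
            setattr(self, name, random.gauss(0, 1))

    def feedforward(self, x):
        h1 = sigmoid(self.w1 * x[0] + self.w2 * x[1] + self.b1)
        h2 = sigmoid(self.w3 * x[0] + self.w4 * x[1] + self.b2)
        return sigmoid(self.w5 * h1 + self.w6 * h2 + self.b3)

    def train(self, data, y_trues, epochs, learning_rate):
        min_losses, avg_losses, max_losses = [], [], []
        for epoch in range(epochs):
            losses = []
            for x, y_true in zip(data, y_trues):
                # Feedforward, keeping the intermediate sums for Backpropagation
                sum_h1 = self.w1 * x[0] + self.w2 * x[1] + self.b1
                h1 = sigmoid(sum_h1)
                sum_h2 = self.w3 * x[0] + self.w4 * x[1] + self.b2
                h2 = sigmoid(sum_h2)
                sum_o1 = self.w5 * h1 + self.w6 * h2 + self.b3
                y_pred = sigmoid(sum_o1)

                # Pieces of the gradients from the Chain Rule derivation above
                d_L = -2 * (y_true - y_pred)
                d_o1 = sigmoid_derivative(sum_o1)
                d_h1 = sigmoid_derivative(sum_h1)
                d_h2 = sigmoid_derivative(sum_h2)

                # Stochastic Gradient Descent update: w <- w - eta * dL/dw
                self.w1 -= learning_rate * d_L * self.w5 * d_o1 * x[0] * d_h1
                self.w2 -= learning_rate * d_L * self.w5 * d_o1 * x[1] * d_h1
                self.b1 -= learning_rate * d_L * self.w5 * d_o1 * d_h1
                self.w3 -= learning_rate * d_L * self.w6 * d_o1 * x[0] * d_h2
                self.w4 -= learning_rate * d_L * self.w6 * d_o1 * x[1] * d_h2
                self.b2 -= learning_rate * d_L * self.w6 * d_o1 * d_h2
                self.w5 -= learning_rate * d_L * h1 * d_o1
                self.w6 -= learning_rate * d_L * h2 * d_o1
                self.b3 -= learning_rate * d_L * d_o1

                losses.append((y_true - y_pred) ** 2)

            min_losses.append(min(losses))
            avg_losses.append(sum(losses) / len(losses))
            max_losses.append(max(losses))

        return min_losses, avg_losses, max_losses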

The training process is implemented in the train(…) method. Usually, to find the optimal values of the weights and biases, we need to perform Backpropagation (updating the weights and biases) many times, again and again. But because the size of the ground-truth dataset is very limited, we loop over the whole dataset once per epoch, for a given number of epochs. Also, to help evaluate our Neural Network after the training process is finished, we collect the minimum, average, and maximum loss for each epoch and return them at the end.

Now it’s time to start training our Artificial Neural Network using the sample ground-truth dataset as shown in Table 1.

The above script trains our Neural Network for 100,000 epochs with a learning rate of 0.001. After the training process finishes, we serialize our Neural Network to disk as a binary file model.bin for later use. After executing the script for a while, we get the following figure:

Figure 6. Minimum loss, Average loss, and Maximum loss

The figure shows 3 curves of Minimum loss, Average loss, and Maximum loss with respect to every epoch. If you execute the training script on your PC, you may get 3 curves with shapes different from mine, since the initial weights and biases are random values.

As shown in Figure 6, the Minimum loss, Average loss, and Maximum loss decreased as the epoch count grew, and they all converged to a very small value. This means that the weights and biases of our Neural Network have reached nearly optimal values. These optimal values represent what the Neural Network has learned from the ground-truth dataset; in other words, they model the correlation between the inputs (the number of coughs per hour and the number of sneezes per hour) and the corresponding results (Positive or Negative for the Coronavirus).

3. Predict Coronavirus infection

To use our pre-trained Neural Network to predict Coronavirus infection, we need to write a small script that loads the binary file model.bin from disk into memory using the pickle module, like this:
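A sketch of such a script is below. The flag names -c and -s match the example that follows, but the exact wording of the Negative-case message is my own guess, and unpickling model.bin requires the NeuralNetwork class definition to be available in the script.

import argparse
import pickle

# Assumes the NeuralNetwork class definition is available in this module
# (pickle needs it to rebuild the object).

parser = argparse.ArgumentParser(description="Predict Coronavirus infection")
parser.add_argument("-c", "--coughs", type=float, required=True,
                    help="number of times you have coughed per hour")
parser.add_argument("-s", "--sneezes", type=float, required=True,
                    help="number of times you have sneezed per hour")
args = parser.parse_args()

# Load the pre-trained Neural Network from disk
with open("model.bin", "rb") as f:
    network = pickle.load(f)

# The output o1 is interpreted as the probability of being Positive
o1 = network.feedforward([args.coughs, args.sneezes])

if o1 >= 0.5:
    print("Warning! You are {:.0f}% POSITIVE with Coronavirus.".format(o1 * 100))
else:
    print("You are {:.0f}% likely NEGATIVE with Coronavirus.".format((1 - o1) * 100))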

To get a prediction of Coronavirus infection, you need to provide the script with two arguments: first, the number of times you have coughed per hour; second, the number of times you have sneezed per hour. For example:

$ python predict.py -c 25 -s 10
Warning! You are 96% POSITIVE with Coronavirus.

So if you have coughed 25 times and sneezed 10 times per hour, you are probably infected with the Coronavirus. You should wear 2 masks and glasses and then go to the hospital immediately :)

4. Conclusion

This article aimed to give you a deep understanding of an Artificial Neural Network, going from the Neuroscience concept → Mathematical Model → Computer Science (Python programming) → Proof of Concept (PoC). I hope this article is helpful to you. The full source code of the article has been pushed to GitHub. If you find any mistakes in the article, please drop a comment below or tweet me @ngocson2vn
