The Essence of Artificial Neural Networks

Ivan Liljeqvist
6 min read · Mar 20, 2016


Everyone in the tech community has heard the recent news about AlphaGo beating the professional Go player Lee Sedol. It’s the first time a computer program has beaten a top professional human Go player.

What’s special about Go compared to other games (such as chess) is its high branching factor: at each turn the player has many more possible moves to choose from. This makes traditional AI methods such as exhaustive tree search ineffective. The number of branches from each node is simply too high for an algorithm to search through in order to decide which move is best.

This is where Artificial Neural Networks come in. AlphaGo would have a hard time beating Lee Sedol if it wasn’t based on Artificial Neural Networks (ANNs).

What are ANNs?

Artificial Neural Networks are an area within Artificial Intelligence that takes most of its inspiration from biology, in particular the brain. The goal of an ANN is to learn some unknown mathematical function from experience.

Imagine we have a very complex mathematical formula that requires a lot of computational power to evaluate. We can train an ANN to approximate this function instead of calculating the exact result each time. Training is done by giving the network some input values and observing the output values we get back. An untrained network will give us incorrect, essentially random values. We then show the network the expected output values for the inputs we gave, and the network adjusts its internal structure based on them. We then give new input values to the network and repeat this process a large number of times.

After each iteration the network changes its internal structure and becomes better and better at approximating the desired function. We stop the training when the output values are close enough to the expected output values across different inputs.

We can now give our trained network input values it has never seen before and get closely approximated output values. We don’t need to waste computational power calculating the exact answers; instead, we can get close approximations from our neural network.

And that’s the beauty of ANNs! They hide the complex processes by learning the connections between different inputs and outputs. The actual process that leads from inputs to outputs is ignored altogether!

This was an example of supervised learning, which means that after each iteration we showed the network the expected output values and the network altered its internal structure based on them.
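
To make this loop concrete, here is a minimal sketch in Python (numpy only; the network size, learning rate, and the choice of sin(x) as the “expensive” function are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: approximate sin(x), standing in for an "expensive" function.
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
y = np.sin(X)

# One hidden layer with tanh activation; weights start out random,
# so the untrained network produces essentially random outputs.
W1 = rng.normal(0, 0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, size=(16, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    # Forward pass: compute the network's current guesses.
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2

    # Compare with the expected outputs.
    err = pred - y

    # Backward pass: nudge every weight to reduce the error.
    dW2 = h.T @ err / len(X); db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h**2)
    dW1 = X.T @ dh / len(X); db1 = dh.mean(axis=0)

    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# The trained network now approximates sin(x) for inputs it never saw.
x_new = np.array([[0.5]])
print(np.tanh(x_new @ W1 + b1) @ W2 + b2, np.sin(0.5))
```

Each pass through the loop is one iteration of the train-compare-adjust process described above: forward pass, compare with the expected outputs, adjust the internal structure.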

Real life example

Let’s imagine that we want to predict the weather. Weather depends on hundreds of different parameters (wind, humidity, pressure, etc.).

[Image: http://www.nar.ucar.edu/sites/default/files/ral/2012_images/2.3_Weather_Prediction_Statistical_Optimization_1.png]

One way to predict the weather would be to take all these parameters and calculate the result from the physical processes involved. This would be a very complex and computationally intensive calculation.

Instead, we can train an ANN to predict the weather by giving it some weather parameters as input values (humidity, pressure, wind, etc.) and showing it the correct output value (sunny/rainy/cloudy) for these particular inputs. We repeat this process a large number of times. Once this is done, we’ll have an ANN that can take weather parameters it has never seen before and predict the weather from them.

Our trained network learned the mapping between inputs and outputs. It has no clue about the actual physical processes that occur in nature.
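
As a sketch of what this could look like in code, here is a toy version using scikit-learn’s MLPClassifier. The feature ranges and the labelling rule are invented stand-ins for real measurements, not anything from an actual weather system:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-ins for measured weather parameters:
# columns = [humidity %, pressure hPa, wind speed m/s]
X = rng.uniform([20, 980, 0], [100, 1040, 20], size=(1000, 3))

# Invented labelling rule just to have training targets:
# 0 = sunny, 1 = cloudy, 2 = rainy
y = np.where(X[:, 0] > 80, 2, np.where(X[:, 1] < 1000, 1, 0))

# One hidden layer of 32 neurons learns the input -> output mapping.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000),
)
model.fit(X, y)

# Predict for weather parameters the network has never seen.
print(model.predict([[85.0, 1010.0, 5.0]]))  # likely "rainy" (2)
```

Note that nowhere in the code do we describe any physics; the network only ever sees input/output pairs.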

The structure of ANNs

ANNs are greatly inspired by the animal brain. The units of calculation in an Artificial Neural Network are called neurons. The neurons are connected via synapses, which are essentially weighted inputs: we take an input and multiply it by a weight specific to that input.

The network learns by adjusting these weights.

[Image: http://www.intechopen.com/source/html/39067/media/image1.png]

Let’s take a closer look at a neuron in an ANN.

[Image: https://upload.wikimedia.org/wikipedia/commons/thumb/6/60/ArtificialNeuronModel_english.png/600px-ArtificialNeuronModel_english.png]

The neuron takes a set of inputs (x1, x2 … xn), and each input is multiplied by its corresponding weight (w1, w2 … wn). These weighted inputs are summed, producing a net value. This net value is then passed through an activation function (which can vary depending on the ANN one wants to construct), giving the output of this particular neuron.
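
In code, a single neuron boils down to a few lines. This sketch uses the sigmoid as the activation function, which is one common choice among many:

```python
import numpy as np

def sigmoid(net):
    # A common activation function squashing the net value into (0, 1).
    return 1.0 / (1.0 + np.exp(-net))

def neuron(x, w, b):
    # Multiply each input by its weight, sum them up (the "net" value),
    # then pass the result through the activation function.
    net = np.dot(x, w) + b
    return sigmoid(net)

print(neuron(np.array([0.5, -1.0, 2.0]),   # inputs x1..x3
             np.array([0.8,  0.2, -0.5]),  # weights w1..w3
             b=0.1))                        # bias term
```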

[Image: https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Colored_neural_network.svg/2000px-Colored_neural_network.svg.png]

The figure illustrates what an ANN looks like at a higher level.

The first layer is called the input layer and represents the inputs we provide to the network. The second layer is called the hidden layer, and it’s here that the calculations happen. Finally, the third layer is the output layer, which contains the final values calculated from the hidden layer.

The connections between the layers are the synapses mentioned above. By altering the weights in these synapses the network is able to learn.

For example, increasing a specific weight will make the input that corresponds to that weight influence the output more. On the other hand, if we decrease a weight, the corresponding input will have less impact on the final output. As the network learns, it strengthens connections that produce correct outputs and weakens those that don’t.

This is analogous to synaptic plasticity in animal brains!
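
A forward pass through a three-layer network like the one in the figure is just two matrix multiplications (a sketch with arbitrary layer sizes and random weights):

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x = np.array([0.2, 0.7, 0.1])          # input layer (3 values)
W_hidden = rng.normal(size=(3, 4))     # synapses: input -> hidden
W_output = rng.normal(size=(4, 2))     # synapses: hidden -> output

hidden = sigmoid(x @ W_hidden)         # hidden layer activations
output = sigmoid(hidden @ W_output)    # output layer: final values
print(output)

# Learning = adjusting weights. Increasing one weight makes its
# corresponding input influence the result more:
W_hidden[0, 0] += 0.5
print(sigmoid(sigmoid(x @ W_hidden) @ W_output))
```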

Deep Learning

The idea behind Deep Learning is to have multiple processing layers. One Deep Learning method is to have an Artificial Neural Network with several hidden layers.

[Image: http://www.rsipvision.com/wp-content/uploads/2015/04/Slide5.png]

The figure illustrates a Deep Neural Network with 3 hidden layers.

Hidden layer 1 makes decisions based on the values from the input layer.

Hidden layer 2 makes decisions based on the outputs from hidden layer 1. Because of this, hidden layer 2 operates at a higher, more abstract level than hidden layer 1.

Hidden layer 3 makes decisions based on the outputs from hidden layer 2, operating at an even higher and more abstract level.
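
In code, stacking hidden layers changes very little: the forward pass simply becomes a loop over the layers (again a sketch with arbitrary sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# A deep network: 5 inputs, 3 hidden layers of 8 neurons, 2 outputs.
layer_sizes = [5, 8, 8, 8, 2]
weights = [rng.normal(size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

activation = rng.uniform(size=5)       # the input layer
for W in weights:
    # Each layer makes its "decision" from the previous layer's output,
    # so later layers operate on increasingly abstract representations.
    activation = sigmoid(activation @ W)
print(activation)                      # the output layer's values
```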

Real life example

Deep Learning is used a lot in image recognition. Several hidden layers operating at different abstraction levels are at the core of why Deep Learning is such a good technique for this particular use case.

[Image: http://www.nature.com/polopoly_fs/7.14689.1389093731!/image/deep-learning-graphic.jpg_gen/derivatives/landscape_400/deep-learning-graphic.jpg]

Let’s imagine that we’re building a face recognition program.

We decide to use a Deep Neural Network. The figure illustrates what such a process could look like.

By having 4 hidden layers, we can divide the problem into smaller parts.

In our case, hidden layer 1 would take the image and identify its light and dark parts.

Then hidden layer 2 would use this information about light and dark parts of the image and based on that identify edges and simple shapes.

Hidden layer 3 would take the information about edges and simple shapes and identify complex shapes and objects.

Finally, hidden layer 4 would take that information and learn which of these shapes and objects actually define a human face.
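
In practice, image networks like this are usually built from convolutional layers, which learn exactly this kind of visual hierarchy. The article doesn’t cover convolutions, so treat this PyTorch sketch as a hedged illustration; all sizes and layer choices are my own assumptions:

```python
import torch
import torch.nn as nn

# Sketch: a tiny convolutional network for 64x64 grayscale images.
# Each block plays the role of one "hidden layer" in the text:
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),    # 1: light/dark patches
    nn.ReLU(), nn.MaxPool2d(2),                   # -> 8 x 32 x 32
    nn.Conv2d(8, 16, kernel_size=3, padding=1),   # 2: edges, simple shapes
    nn.ReLU(), nn.MaxPool2d(2),                   # -> 16 x 16 x 16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 3: complex shapes/objects
    nn.ReLU(), nn.MaxPool2d(2),                   # -> 32 x 8 x 8
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # 4: face-like arrangements
    nn.ReLU(), nn.MaxPool2d(2),                   # -> 64 x 4 x 4
    nn.Flatten(),
    nn.Linear(64 * 4 * 4, 2),                     # output: face / not face
)

scores = model(torch.randn(1, 1, 64, 64))         # one random test image
print(scores)
```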

AlphaGo and Deep Learning

Connecting back to AlphaGo! AlphaGo is another real-life use case of Deep Learning. It was built using deep neural networks, trained first by supervised learning on games played by human experts and then by reinforcement learning from games of self-play. As we all know, this approach gave amazing results.

Conclusion

Artificial Neural Networks are a really fascinating topic. The connection between biology and Computer Science is what excites me the most. Although an ANN is a rather loose model of the human brain, it’s still remarkable that we can mimic principles nature has created and actually get impressive results.
