Thinking artificially: Artificial Neural Networks, part I

Andy Elmsley
Published in The Sound of AI
4 min read · Mar 27, 2019


The latest tutorial in our AI coding series.

Welcome back, neurons. This week we’re introducing a new AI topic. Over the next few tutorials, you’ll learn the basic theory behind Artificial Neural Networks (ANNs) — a powerful and versatile AI model — and how to build and train one from scratch.

You could just use a library like TensorFlow to create and train a powerful, ‘deep’ ANN in a few lines of code. While that’s handy for building efficient ANNs quickly, it isn’t ideal if you want to really understand the system and how to build one yourself. So let’s go deeper, and explore the wonderfully simple-yet-complex world of ANNs.

A visual representation of an ANN.

What is an Artificial Neural Network?

ANNs are vaguely inspired by the way biological brains like our own work. Our brains are made up of individual cells called neurons. On their own, these neurons don’t do much; they receive some electrical input and, when activated, produce some output. But when these neurons are connected together, they exhibit higher-level behaviour that gives rise to all of our cognitive processes and abilities, such as thought, consciousness and memory. While the human brain has around 100 billion neurons, ANNs use far fewer, massively simplifying matters. Even the deepest of ‘deep’ networks in use today only contain around one million neurons. ANNs simulate the behaviour of their biological counterparts with nonlinear mathematical functions that map a given input onto a desired output.

ANNs consist of the following elements:

  • An input layer;
  • One or more hidden layers;
  • An output layer;
  • A set of weighted connections between the layers; and
  • An activation function for each layer (except the input layer).

Artificial neurons under the microscope

Neurons are the computational units of an ANN — its building blocks. An artificial neuron, like its biological counterpart, receives one or more inputs, computes a weighted sum of them and passes that sum through an activation function to produce an output.

An artificial neuron

Let’s see how this works mathematically. We have the following elements to consider:

  • xk are the k input signals (from the previous layer or the network input);
  • wk are the weights of the connections between the inputs and the neuron; and
  • f is the activation function.

The output of the neuron is y = f(x1w1 + x2w2 + … + xkwk).
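For example, with two inputs x1 = 1, x2 = 2 and weights w1 = 0.5, w2 = -0.25, the weighted sum is 1 × 0.5 + 2 × (-0.25) = 0, so the output is y = f(0), which for the sigmoid function below works out to 0.5.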

There are several types of activation functions we can use in ANNs; ReLU, tanh and sigmoid are some of the most common. For now we’ll use the sigmoid function, σ(x) = 1 / (1 + e^-x), a beautiful nonlinear function with some nice mathematical properties like differentiability and continuity:

The sigmoid activation function and its output.

An artificial neuron can be implemented in just a few lines of code.
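Here’s a minimal sketch in Python, assuming NumPy (the function names are illustrative):

    import numpy as np

    def sigmoid(x):
        # Sigmoid activation: squashes any real input into the range (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    def neuron_output(inputs, weights):
        # Weighted sum of the inputs, passed through the activation function
        return sigmoid(np.dot(inputs, weights))

    # The worked example from above: 1*0.5 + 2*(-0.25) = 0, and sigmoid(0) = 0.5
    print(neuron_output(np.array([1.0, 2.0]), np.array([0.5, -0.25])))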

Connecting the dots

A single neuron by itself is not a particularly powerful processing unit. The strength of an ANN becomes clear when we connect several neurons in a layered network.

ANNs consist of a number of layers. The first layer is called the input layer. Then we have at least one, but often more, hidden layers. Finally, there is an output layer. One of the simplest ANNs is the Multilayer Perceptron (MLP). This is a type of feedforward network, meaning that data flows in one direction, from input to output: we start at the input nodes, move through the hidden layers and eventually arrive at the output layer.

Architecture of a Multilayer Perceptron.

In an ANN, each connection between the layers is weighted. Usually the network undergoes a process of training, which is nothing more than iteratively tweaking the values of those connection weights so that the network behaves the way you want it to. We’ll cover how to train our networks next week, but for now let’s look at an object-oriented implementation of an MLP.
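Here’s a minimal sketch in Python, assuming NumPy (the class and method names are illustrative, and the weights are random because the network hasn’t been trained yet):

    import numpy as np

    def sigmoid(x):
        # Sigmoid activation: squashes any real input into the range (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    class MLP:
        """A feedforward network with a single hidden layer."""

        def __init__(self, num_inputs=3, num_hidden=3, num_outputs=2):
            self.num_inputs = num_inputs
            # Weighted connections between the layers, initialised randomly
            self.weights_input_to_hidden = np.random.rand(num_inputs, num_hidden)
            self.weights_hidden_to_output = np.random.rand(num_hidden, num_outputs)

        def forward_propagate(self, inputs):
            # Input layer -> hidden layer: weighted sums, then activation
            hidden_activations = sigmoid(np.dot(inputs, self.weights_input_to_hidden))
            # Hidden layer -> output layer: weighted sums, then activation
            return sigmoid(np.dot(hidden_activations, self.weights_hidden_to_output))

    if __name__ == "__main__":
        mlp = MLP()
        inputs = np.random.rand(mlp.num_inputs)
        # Untrained random weights, so the output is essentially noise
        print(mlp.forward_propagate(inputs))

Note that this sketch hard-codes a single hidden layer; that limitation is exactly what the challenge below asks you to remove.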

Brain freeze

That’s it for this week! We learned about ANNs and their architecture, and implemented our first ANN — an MLP. Right now the network isn’t very useful, because it just produces noise. To do anything exciting, we need to train it — and that’s exactly what we’ll do next time.

In the meantime I have a challenge for you: the MLP implementation above only works with one hidden layer. Can you extend it to have any number of hidden layers that differ in size?

We’ll go through the solution to this next week, and as always you can find the source code on our GitHub.

To begin your AI-coding training at day one, go here.

And give us a follow to receive updates on our latest posts.
