Let’s code a Neural Network from scratch — Part 2

Part 1, Part 2 & Part 3

Charles Fried · TypeMe · 4 min read · Mar 29, 2017


Now that we’ve got our initial setup, we can get onto the more interesting stuff. The next step is to complete the network by adding the hidden layer and the output layer. We will then construct the feed-forward structure that we discussed in part 1.

Firstly, let’s expand our “Neuron” class so we can initialise it in two ways: either as an input-layer neuron, which has no inputs other than the card it gets shown, or as a hidden- or output-layer neuron, which takes as its inputs the outputs of the preceding layer. We also give it some weights, randomly generated between -1 and +1. For this we need an array of neurons (m_input) and an array of weights (m_weights) in the initialiser, as sketched below.
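A minimal Processing-style sketch of those two initialisers might look like the following. The field names m_input and m_weights and the -1 to +1 weight range come from the text above; everything else (the exact signatures, the m_output field) is an assumption rather than the original gist.

```java
class Neuron {
  Neuron[] m_input;    // neurons of the preceding layer (empty for the input layer)
  float[]  m_weights;  // one weight per incoming connection
  float    m_output;   // the value this neuron currently puts out

  // Input-layer neuron: no incoming connections, its output is set directly from the card
  Neuron() {
    m_input   = new Neuron[0];
    m_weights = new float[0];
  }

  // Hidden- or output-layer neuron: fed by every neuron of the preceding layer
  Neuron(Neuron[] inputs) {
    m_input   = inputs;
    m_weights = new float[inputs.length];
    for (int i = 0; i < m_weights.length; i++) {
      m_weights[i] = random(-1, 1);  // random starting weight between -1 and +1
    }
  }
}
```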

Now, the following statement is crucial to understanding how the “respond()” function works, as it is the core of this ANN:

The input of the hidden and output layer is related to the weighted sum of the outputs of the preceding layer.

We can write this very simply like so, where the total input is “in”, the output of each neuron in the previous layer is “o” and the corresponding weight is “w”:

in = o₁·w₁ + o₂·w₂ + … + oₙ·wₙ

The total input into a hidden or output neuron (input) is the sum of the outputs of the preceding layer (inputs[i].output), each multiplied by the corresponding weight (weights[i]) in its current state.
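Putting that together, a hedged sketch of the “respond()” method, added to the Neuron class above, could read as follows (field names follow the initialiser sketch; lookupSigmoid() is introduced further down).

```java
// Compute this neuron's output from the preceding layer
void respond() {
  float input = 0;
  for (int i = 0; i < m_input.length; i++) {
    // weighted sum of the previous layer's outputs
    // (the "inputs[i].output * weights[i]" referred to in the text)
    input += m_input[i].m_output * m_weights[i];
  }
  m_output = lookupSigmoid(input);  // squash the sum through the activation function
}
```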

It is by tweaking these weights that our network learns to recognise each digit, in a similar way to how seeing the “70mph” sign caused us to slow down.

We complicate this a little further by passing the total input through “lookupSigmoid()”, which applies what is called an activation function. This ultimately tells us whether the neuron should fire, based on its input from the preceding layer. Let’s look at this in more detail.

Activation Function: Sigmoid

Instead of firing in a binary manner (on/off) we want to “sway” the strength of the output one way or the other. When the weighted sum “input” computed in “respond()” is closer to -3 (x), the output is around -1 (sig(x)), i.e. not firing. Conversely, when it’s around +3 (x), the output of “lookupSigmoid()” is close to +1 (sig(x)), i.e. firing.

Sigmoid Graph @ Wikipedia

We can write the sigmoid mathematically like so, rescaled here so that its output runs from -1 to +1, matching the firing range described above:

y = 2 / (1 + e^(-x)) - 1

“x” is the input and “y” is the output. The nonlinear nature of this function means that the rate of change is slower at the extremes and faster in the centre. Put plainly, we want the neuron to “make its mind up” instead of indecisively sitting in the middle.

Plotting the values inside of “g_sigmoid” from the code below

The sigmoid function is relatively expensive to compute, so to stop it from slowing us down let’s precompute an array of 200 readings in “setupSigmoid()”, which will get called from “setup()”. We can then find the closest precomputed value using the “lookupSigmoid()” function.
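A sketch of what that lookup table could look like in Processing is below. The array name g_sigmoid and the 200 readings come from the text; the sampled range of ±5 and the rescaling of the sigmoid to -1…+1 are assumptions made to stay consistent with the behaviour described above.

```java
int SIGMOID_SAMPLES = 200;      // number of precomputed readings
float SIGMOID_RANGE = 5.0;      // assumed: table covers x from -5 to +5
float[] g_sigmoid = new float[SIGMOID_SAMPLES];

// Precompute the activation curve once; called from setup()
void setupSigmoid() {
  for (int i = 0; i < SIGMOID_SAMPLES; i++) {
    // map the table index to an x value in [-SIGMOID_RANGE, +SIGMOID_RANGE]
    float x = map(i, 0, SIGMOID_SAMPLES - 1, -SIGMOID_RANGE, SIGMOID_RANGE);
    // sigmoid rescaled to output between -1 and +1
    g_sigmoid[i] = 2.0 / (1.0 + exp(-x)) - 1.0;
  }
}

// Return the nearest precomputed value instead of calling exp() every time
float lookupSigmoid(float x) {
  int i = (int) map(x, -SIGMOID_RANGE, SIGMOID_RANGE, 0, SIGMOID_SAMPLES - 1);
  i = constrain(i, 0, SIGMOID_SAMPLES - 1);
  return g_sigmoid[i];
}
```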

Our perceptron (the “Neuron” class) now combines randomly initialised weights, the weighted sum in “respond()” and the sigmoid lookup.

Making the link

With the sigmoid out of the way and the ability to take inputs from the previous layer, we’re now ready to link all the layers together.

Based on the structure of the network we’ve discussed, we’ll hard-code all three layers within the “Network” class. Notice how, in the initialiser below, each layer feeds from the previous one, starting with the input layer.
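A sketch of that initialiser, under the assumption that the three layers are stored as plain arrays of neurons (the layer field names here are my own, not necessarily those of the original gist):

```java
class Network {
  Neuron[] m_inputLayer;
  Neuron[] m_hiddenLayer;
  Neuron[] m_outputLayer;

  Network(int inputs, int hidden, int outputs) {
    // input layer: no incoming connections, it just mirrors the card's pixels
    m_inputLayer = new Neuron[inputs];
    for (int i = 0; i < inputs; i++) m_inputLayer[i] = new Neuron();

    // hidden layer: every neuron is fed by the whole input layer
    m_hiddenLayer = new Neuron[hidden];
    for (int i = 0; i < hidden; i++) m_hiddenLayer[i] = new Neuron(m_inputLayer);

    // output layer: every neuron is fed by the whole hidden layer
    m_outputLayer = new Neuron[outputs];
    for (int i = 0; i < outputs; i++) m_outputLayer[i] = new Neuron(m_hiddenLayer);
  }
}
```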

So far we’ve only implemented the “respond()” function in the “Neuron” class; to make the whole network respond we need to call it from inside the “Network” class, which is responsible for iterating through each neuron in the network and making it respond.

Notice how the input layer simply copies its input from the “Card” class, whereas the other layers call the “respond()” function located inside the “Neuron” class.
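A hedged sketch of that network-level “respond()” follows. How the Card exposes its 196 pixel values is an assumption; the card.cells() accessor used here is hypothetical.

```java
// Feed one card through the whole network
void respond(Card card) {
  float[] pixels = card.cells();   // hypothetical accessor for the 14x14 pixel values

  // the input layer simply copies the card's pixels...
  for (int i = 0; i < m_inputLayer.length; i++) {
    m_inputLayer[i].m_output = pixels[i];
  }
  // ...while the hidden and output layers compute their weighted sums
  for (int i = 0; i < m_hiddenLayer.length; i++) m_hiddenLayer[i].respond();
  for (int i = 0; i < m_outputLayer.length; i++) m_outputLayer[i].respond();
}
```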

Finally, let’s create a function to draw the network:
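One possible version, kept deliberately rough: each neuron is drawn as a circle shaded by its current output, with darker meaning firing harder. The layout numbers are arbitrary and the method name display() is my own.

```java
// Draw the network: one shaded circle per neuron, one row per layer
void display() {
  Neuron[][] layers = { m_inputLayer, m_hiddenLayer, m_outputLayer };
  for (int l = 0; l < layers.length; l++) {
    for (int n = 0; n < layers[l].length; n++) {
      // map the output (-1..+1) to a grey value: darker means a stronger response
      float shade = map(layers[l][n].m_output, -1, 1, 255, 0);
      fill(shade);
      ellipse(40 + n * 12, 60 + l * 80, 10, 10);
    }
  }
}
```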

All we have to do now is implement our changes in “setup()”. We simply load the sigmoid values and set up a network with 196 neurons in the input layer, 49 in the hidden layer and 10 in the output layer.
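Something along these lines, assuming a global g_network variable and a SIZE constant for the 14-pixel card width (both names are assumptions):

```java
int SIZE = 14;          // the cards are 14 x 14 pixels, giving 196 input neurons
Network g_network;

void setup() {
  size(640, 480);
  setupSigmoid();                                 // precompute the activation lookup table
  g_network = new Network(SIZE * SIZE, 49, 10);   // 196 input, 49 hidden, 10 output neurons
}
```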

And here we have the full network set up: each layer responds to the previous one, with weights that are initially randomly generated. All we have left to do is figure out how to tweak those weights so the network predicts the right output, which we’ll do using backpropagation in the next and final chapter.

It’s worth clarifying that the output layer won’t always give you one clear answer. Rather, it displays a probability for each possible answer. We can then take the neuron indicating the highest probability (darkest colour) as the network’s most likely guess.
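For instance, a hypothetical helper for reading off the most likely digit could simply pick the output neuron that fires the strongest:

```java
// Hypothetical helper: the network's answer is the strongest output neuron
int bestGuess(Network net) {
  int best = 0;
  for (int i = 1; i < net.m_outputLayer.length; i++) {
    if (net.m_outputLayer[i].m_output > net.m_outputLayer[best].m_output) {
      best = i;
    }
  }
  return best;  // index 0..9, corresponding to the digit the network favours
}
```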

The full code for part 2 can be found on GitHub.

If you’re enjoying this and want more please hit the 💚

Go to Part 3

The original algorithm is from Alasdair Turner
