How Machines ‘Learn’ — AI: Explainer and Examples

Opening up Neural Networks

Steven Vuong
Analytics Vidhya
7 min read · Jun 14, 2020


Where’s Wally? Let’s scan the faces: “no no no no no no no.. yes!” From the visual stimuli your eyes take in, you can tell whether each face in the sea of people belongs to the person you’re looking for or not.

You have just performed a function to identify a particular face — something that we are evolutionarily fantastic at. Your smartphone may have a facial recognition unlock function based on a neural network that maps an image of your face to your identity. A neural network is a type of mathematical function that maps a given input to a desired output.

Source: Calculus Explained with pics and gifs

So how does a neural network learn to do this? Like teaching a dog to play fetch, we must train it!

For comparison: a toddler may have the capacity to learn what a puppy is after seeing one or two examples, whereas a data-hungry neural network takes a lot of time and energy to learn feature representations of a particular class, often needing tens of thousands of samples during training.

Source: https://analyticsindiamag.com/how-to-create-your-first-artificial-neural-network-in-python/

This is madness! What is this training you speak of? There are two primary components of training a neural network.
First we have forward propagation: in the beginning, this is essentially a guess as to what the correct answer/output might be. I also talk about how this occurs in computer vision in my first article.

Then we compare our guess to the actual answer to measure its error and, based on that, work backwards through the network updating its parameters so that future guesses are more accurate. This step is called backpropagation.

Over multiple iterations our network gets better and better at guessing the right answer. One might even say it is making predictions! ;o
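As a very rough sketch of that loop in Python, here is a toy ‘network’ with a single weight learning to map x to 2x (the data and learning rate are made up purely for illustration):

```python
# Toy training loop: guess, measure the error, nudge the weight, repeat.
weight = 0.0
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # inputs x paired with targets y = 2x

for epoch in range(50):
    for x, y in data:
        y_hat = weight * x           # forward propagation: our current guess
        error = y_hat - y            # how far off the guess was
        weight -= 0.05 * error * x   # backpropagation: nudge the weight to reduce the error

print(weight)  # ends up close to 2.0
```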

Keep in mind that the original intuition behind a neural network was loosely to replicate the firing of neurons in the brain, reinforcing particular sequences of neurons that fire in response to a given stimulus.

Source: https://i.pinimg.com/originals/23/b1/1b/23b11ba3f0760cd585e5692bca858ed5.gif

Let’s introduce the components of a neural network:

Source: https://missinglink.ai/guides/neural-network-concepts/neural-network-bias-bias-neuron-overfitting-underfitting/
  • Input Layer, x
  • Hidden Layer(s)
  • Output Layer, ŷ
  • Weighted Connections, Wᵢ and biases, bᵢ between each layer

You can see these are connected as an acyclic graph, where neurons in each layer have weighted connections (synapses) to the next layer. These weights amplify or dampen the response to their inputs.

Additionally we do not count the input layer, so the figure you see above is a two-layer neural network; we have one output layer and one hidden layer.

As for an individual neuron, what it does with its inputs is:

  1. Multiply the inputs by the weights
  2. Sum these values and add the bias
  3. Pass the entire value through an activation function

And that’s Forward Propagation! What we’ve done is pass input values through our network to get an output value, voila.
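Here is a minimal sketch of those three steps for a single neuron in Python (the sigmoid activation is covered in the Appendix; the input, weight and bias values are made up):

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(inputs, weights, bias):
    z = np.dot(inputs, weights) + bias  # steps 1 and 2: multiply by weights, sum, add bias
    return sigmoid(z)                   # step 3: pass through the activation function

# Made-up example values
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.3])
b = 0.2
print(neuron_output(x, w, b))
```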

For those more interested in the calculus behind this, you can check out the Appendix at the bottom.

Now we determine how close our model output ŷ is to the actual value y. For this we use a loss function. One simple loss function is the sum of squared errors over our samples, Loss = Σ(y − ŷ)².
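In Python this could look like the sketch below (assuming y_true and y_pred are arrays holding the actual and predicted values):

```python
import numpy as np

def sum_squared_errors(y_true, y_pred):
    """Sum of the squared differences between the targets and the model's guesses."""
    return np.sum((y_true - y_pred) ** 2)

print(sum_squared_errors(np.array([1.0, 0.0, 1.0]), np.array([0.9, 0.2, 0.4])))  # ~0.41
```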

Recall that in training we want to optimise the weights and biases to minimise the loss function. To do this we propagate the error backwards using a popular method called gradient descent. Again the calculus behind this will be covered in the Appendix for those interested.

Source: https://media.giphy.com/media/O9rcZVmRcEGqI/giphy.gif

With gradient descent we backpropagate the error through our neural network to update the weights and biases, which incrementally reduces the loss until we can do so no more, reaching a local minimum.
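To make the idea concrete, here is a tiny sketch of gradient descent in Python on a made-up one-parameter loss; the loss (w − 3)² and learning rate 0.1 are purely illustrative, but in a neural network the very same update is applied to every weight and bias:

```python
def gradient_descent_step(param, grad, learning_rate=0.1):
    """Take one small step downhill along the gradient."""
    return param - learning_rate * grad

# Toy example: minimise loss(w) = (w - 3)**2, whose gradient is 2*(w - 3)
w = 0.0
for _ in range(100):
    w = gradient_descent_step(w, 2 * (w - 3))

print(w)  # ends up very close to 3, the minimum of the loss
```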

Whilst there are other nuances that can be considered, such as the learning rate, overfitting, normalisation and much more to get stuck into, we have just gone through how a neural network is able to ‘learn’ a function which takes an input and maps it to a desired output!

Source: Hello Neural Network — The Beginner’s Guide to Understanding Neural Networks

And now for the examples! In our everyday lives we interact with neural networks perhaps more than we think, and this is likely to increase in the future; below are a few:

Targeted Advertising: Using your details and information as input parameters, say age, sex and location, one could train a neural network to determine which adverts will have the highest engagement rates with personalised marketing.

Automated Chatbots: Faster interactions in online chat to answer your questions and improve the user experience. This falls under the natural language processing domain and is offered as a service by Microsoft.

Credit Rating: Neural networks can be used to assess the risk of individual customers, as Oracle writes about in their blog.

Financial Forecasting: Investors willing to utilise neural networks to make investments hope to gain a competitive advantage; one such company is TwoSigma.

Fraud Detection: Many companies handling any kind of transaction may have to deal with fraudulent transactions. To stay one step ahead, organisations can utilise neural networks to detect anomalies.

Awesome! We made it to the end!

Source: https://media0.giphy.com/media/fxtEUQIqolzxSaxVSF/giphy.gif

More examples related to computer vision in my prior two articles:

Further Resources:

Thanks for reading, hope you learnt something! Also ❤❤ Calvin & Venus.

Steven Vuong, Data Scientist

Open to comments, feedback and suggestions for the next article.
stevenvuong96@gmail.com
https://www.linkedin.com/in/steven-vuong/
https://github.com/StevenVuong/

Appendix:
S’more Calculus behind Forward and Back Propagation for the mathematical marshmallows

First, we have Forward Propagation. To explain this better, we can first look at how an individual neuron takes inputs and produces an output:

Source: https://pythonmachinelearning.pro/wp-content/uploads/2017/09/Single-Perceptron.png.webp

What we see above are the inputs being multiplied by the weights and summed with the bias. This is then passed through an activation function, denoted by sigma, to produce an output.

One commonly used activation function is the sigmoid function, σ(z) = 1 / (1 + exp(−z)):

Source: https://miro.medium.com/max/970/1*Xu7B5y9gp0iL5ooBj7LtWw.png

An important feature is that it is differentiable at any point, meaning we can determine its slope at any point along the z-axis.

Activation functions are required to introduce non-linearity to our model.
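A quick Python sketch of the sigmoid and its slope (its derivative has the convenient closed form σ(z)·(1 − σ(z))):

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    """Slope of the sigmoid at z, i.e. sigmoid(z) * (1 - sigmoid(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

print(sigmoid(0.0), sigmoid_derivative(0.0))  # 0.5 and 0.25
```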

So for initial inputs x, our first layer output is the first layer weights multiplied by the input and summed with the bias, before being passed into our activation function:

a₁ = σ(W₁x + b₁)

And as our first layer output is the same as our second layer input, our second layer output is:

a₂ = σ(W₂a₁ + b₂)

And because our second layer is our final layer, we have ŷ = a₂.

Bing bang boom! Piecing this all together for our two-layer neural network, we have:

ŷ = σ(W₂σ(W₁x + b₁) + b₂)

Which gives us our model output ŷ, hurrah!
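As a rough numpy sketch of that full forward pass (the layer sizes and random starting weights below are made up purely for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative sizes: 3 inputs, 4 hidden neurons, 1 output neuron
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

x = np.array([0.5, -1.0, 2.0])

a1 = sigmoid(W1 @ x + b1)      # first (hidden) layer output
y_hat = sigmoid(W2 @ a1 + b2)  # second (output) layer output, our ŷ
print(y_hat)
```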

Before we continue, gif break!

Source: https://media1.tenor.com/images/1e977363e546d3020a09062593852840/tenor.gif?itemid=5631613.. Five Guys please sponsor me

Recall from earlier that we use the sum of squared errors as our loss function: Loss = Σ(y − ŷ)².

For backpropagation we use the chain rule to calculate the partial derivatives of the loss with respect to our weights and biases in order to adjust these accordingly. These gradients allow us to move towards a local minimum of the cost function over multiple steps. For this we use gradient descent.

So as an example of determining the amount we update W₂ by, we can use the chain rule to calculate:

∂Loss/∂W₂ = ∂Loss/∂ŷ · ∂ŷ/∂W₂ = 2(ŷ − y) · σ′(W₂a₁ + b₂) · a₁

And finally we can update the parameter W₂:

W₂ → W₂ − μ · ∂Loss/∂W₂

Where μ controls how much we adjust W₂ by; this is known as the learning rate. After applying the same principles to b₂, and then to our first layer parameters W₁ and b₁, we have completed one whole training epoch!
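Pulling the Appendix together, here is a rough numpy sketch of that whole procedure for our two-layer network, repeated over several steps (the layer sizes, learning rate and single training example are made up purely for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # first layer parameters
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # second (output) layer parameters
mu = 0.5                                       # learning rate

x = np.array([0.5, -1.0, 2.0])  # one training input
y = np.array([1.0])             # its true label

for step in range(100):
    # Forward propagation
    a1 = sigmoid(W1 @ x + b1)
    y_hat = sigmoid(W2 @ a1 + b2)

    # Backpropagation: chain rule, starting from the loss (y - y_hat)**2
    delta2 = 2 * (y_hat - y) * y_hat * (1 - y_hat)  # dLoss/dz2
    dW2, db2 = np.outer(delta2, a1), delta2
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)        # dLoss/dz1
    dW1, db1 = np.outer(delta1, x), delta1

    # Gradient descent update of every weight and bias
    W2, b2 = W2 - mu * dW2, b2 - mu * db2
    W1, b1 = W1 - mu * dW1, b1 - mu * db1

print(y_hat)  # moves towards the target y = 1.0 as training progresses
```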

Source: https://media1.tenor.com/images/e0a669626522df539e2c3cced9454700/tenor.gif?itemid=8102486
