Backpropagation — the Algorithm That Tells How a Neural Network Learns
In this article, we will learn about →
- Why do we need backpropagation?
- What is backpropagation?
- How does backpropagation work?
When you learn about neural networks, you are bound to hear the words backpropagation and feed-forward. If you don't know the exact mechanism of backpropagation and how it works internally, you have arrived at the right place.
Backpropagation is a supervised learning algorithm that tells us how a neural network learns, i.e., how to train a multi-layer perceptron (artificial neural network). But some of you might be wondering why we need to train a neural network at all.
Why do we need Backpropagation?
When designing a neural network, you begin by initializing its weights with random values.
There is no guarantee that the weights you initialized are correct or fit the model best. Let's look at the image given below →
As you can see in the above image, the model's output is very different from the actual output, which means the error value is huge.
Now the question arises: how will you reduce the error?
To reduce the error, we need to somehow tell the model to change its parameters (i.e., the weights and biases) so that the error becomes minimal.
In other words, you have to train your model. One way to train your model is called backpropagation.
Now, the next question is: how will you train your model? Training your network takes the following steps (a minimal code sketch follows the list):
- Feed-forward — perform a feed-forward operation.
- Calculate the error — compare the output of the model with the desired output.
- Is the error minimal? — check whether the error is minimal; if it is, stop, else go backward.
- Backward propagation of errors — run the feed-forward operation in reverse (backpropagation).
- Update the parameters — if the error is still large, update the parameters (weights and biases).
- Repeat — repeat the process until the error becomes minimal.
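To make these steps concrete, here is a minimal sketch of the loop in Python with NumPy. The XOR data, the 2-4-1 network shape, the learning rate, and the stopping threshold are my own illustrative assumptions, not code from this article:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data (XOR) and the 2-4-1 network shape are assumptions for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4))   # random initial weights
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))
b2 = np.zeros((1, 1))
lr = 2.0                       # assumed learning rate

for epoch in range(20000):
    # 1. Feed-forward.
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # 2. Calculate the error (mean squared error here).
    error = np.mean((y - y_hat) ** 2)

    # 3. Is the error minimal? Stop once it is small enough.
    if error < 1e-3:
        break

    # 4. Backward propagation of errors (chain rule; sigmoid' = s * (1 - s)).
    d_out = (y_hat - y) * y_hat * (1 - y_hat)
    d_hid = (d_out @ W2.T) * h * (1 - h)

    # 5. Update the parameters, then repeat.
    W2 -= lr * h.T @ d_out / len(X)
    b2 -= lr * d_out.sum(axis=0, keepdims=True) / len(X)
    W1 -= lr * X.T @ d_hid / len(X)
    b1 -= lr * d_hid.sum(axis=0, keepdims=True) / len(X)

print(f"error = {error:.5f} after {epoch} epochs")
```

How quickly the loop reaches the threshold depends on the random initialization; what matters here is the structure of the loop, which mirrors the steps above one for one.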
Once the error becomes minimal, you can feed inputs to your model and it will produce the correct output.
I hope you now understand why we need backpropagation and what training a model means.
Now is the right time to understand what backpropagation is and how it works.
What is Backpropagation?
Backpropagation gives us detailed insight into how changing the weights and biases changes the overall behavior of the network.
The term backpropagation is short for "backward propagation of errors". Let's understand it with an example:
Note: this example is included only to build theoretical intuition for backpropagation, not to demonstrate the practical algorithm.
In day-to-day life you often enter a captcha to access a site, but sometimes your first attempt at reading the captcha is wrong.
Look at the captcha given below and take a moment to think about what it says -
Did you recognize it correctly?
Let's match your answer with the given answer -
Perhaps you were not able to recognize it correctly on the first attempt. So what did you do?
To correct the mistake, you went back, analyzed where you made the mistake (i.e., the error), corrected the wrong part, and then tried again with the correct answer.
For a better understanding, see the given image -
So you can say that, to give an accurate result, your brain goes back and analyzes the mistake, updates some parameters, and lights up only those neurons with higher accuracy in order to minimize the mistake (i.e., the error).
This concept is called “Backpropagation”.
I hope you now understand the theoretical concept of backpropagation. We apply the same concept to train our neural network.
The backpropagation algorithm looks for the minimum of the error function in weight space using a technique called gradient descent.
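Written as a formula, each gradient descent step moves every weight a small step against the gradient of the error (the learning-rate symbol η is standard notation, not something defined in this article):

```latex
w_{ij} \;\leftarrow\; w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}
```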
How does backpropagation work?
For a better understanding of how the backpropagation algorithm works, you first have to understand:
- the architecture of the neural network,
- the feed-forward (forward pass) process,
- and the error function.
1. The architecture of the Neural Network
In the given neural network there are three types of layers:
- Input Layer
- Hidden Layer
- Output Layer
1. Input layer — the input layer consists of n neurons, namely x1, x2, x3, …, xn.
2. Hidden layers — the first hidden layer consists of n−1 neurons, namely h21, h22, h23, …, h2(n−1), and the second hidden layer consists of two neurons, namely h31 and h32.
3. Output layer — the output layer produces the outputs O1, O2, …, O(n−2).
4. Some random weights W1, W2, …, Wn are initialized.
5. The input layer also carries a bias b1.
The input layer is connected to the hidden layer and the hidden layer is connected to the output layer through interconnection weights.
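As a concrete sketch, such an architecture could be laid out in NumPy like this. The layer sizes (3 inputs, two hidden layers of 2 neurons each, 1 output) are illustrative assumptions, since the article describes the network only in general terms:

```python
import numpy as np

rng = np.random.default_rng(42)

n_in, n_h1, n_h2, n_out = 3, 2, 2, 1  # assumed layer sizes

# Interconnection weights between consecutive layers, plus biases.
W1 = rng.normal(scale=0.1, size=(n_in, n_h1))   # input    -> hidden 1
W2 = rng.normal(scale=0.1, size=(n_h1, n_h2))   # hidden 1 -> hidden 2
W3 = rng.normal(scale=0.1, size=(n_h2, n_out))  # hidden 2 -> output
b1, b2, b3 = np.zeros(n_h1), np.zeros(n_h2), np.zeros(n_out)
```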
2. Feed-Forward
Take an input vector x, apply the first weight matrix W1 and a sigmoid function to obtain the values of the second layer, then apply the second weight matrix W2 and another sigmoid function to obtain the values of the third layer, and so on, until you get the final prediction. This is the "feed-forward process" the neural network uses to obtain a prediction from the input vector.
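In code, the feed-forward pass over the layers sketched above could look like this (reusing the assumed W1, W2, W3 and biases from the previous snippet):

```python
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(x):
    # Each layer multiplies by its weight matrix, adds the bias,
    # and applies the sigmoid activation.
    h1 = sigmoid(x @ W1 + b1)      # values of the second layer
    h2 = sigmoid(h1 @ W2 + b2)     # values of the third layer
    y_hat = sigmoid(h2 @ W3 + b3)  # final prediction
    return y_hat

x = np.array([0.5, -0.2, 0.1])    # an example input vector (assumed)
print(feed_forward(x))
```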
3. Error Function
The error function tells us how badly a point is misclassified.
Just as before, the neural network produces an error function, and this error is what we will minimize in the end.
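The article leaves the exact error function to an image; a common choice, and the one assumed in the code sketches here, is the squared error between the actual output y and the prediction ŷ:

```latex
E = \frac{1}{2}\,\bigl(y - \hat{y}\bigr)^{2}
```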
Now I hope you know:
- what the neural network architecture looks like,
- what the feed-forward process is,
- and what the error function is.
So, let's understand the training algorithm.
Training Algorithm
The training algorithm of backpropagation involves four stages:
Step 1: Initialization of the weights — some small random values are assigned.
Step 2: Feed-forward.
Step 3: Backpropagation of errors.
Step 4: Update of the weights and biases.
1. Initialization of weights
Step 1: Initialize each neuron's weights to small random values →
2. Feed-Forward
After this process you get ŷ, the predicted value of your model; subtract it from the actual output to get the error.
If the error is minimal, stop. If not, go to the third step, the backpropagation of errors.
3. Backpropagation of errors
In backpropagation, we find the gradient of the error function →
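The equation referenced here was an image in the original post; written out, the gradient simply collects the partial derivative of the error with respect to every weight:

```latex
\nabla E = \left( \frac{\partial E}{\partial W_1},\; \frac{\partial E}{\partial W_2},\; \dots,\; \frac{\partial E}{\partial W_n} \right)
```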
Here, to calculate all the derivatives, we will use the chain rule.
Before diving into the backpropagation of errors, you should have a good grasp of the chain rule, so let's have a quick recap of it first.
Note: we are using the chain rule to find the contribution of every weight in each neuron.
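As a quick refresher: if E depends on the prediction ŷ, ŷ depends on a hidden activation h, and h depends on a weight w, the chain rule multiplies the derivatives along that path:

```latex
\frac{\partial E}{\partial w}
  = \frac{\partial E}{\partial \hat{y}}
    \cdot \frac{\partial \hat{y}}{\partial h}
    \cdot \frac{\partial h}{\partial w}
```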
So, let’s calculate the gradient of the error function with the help of backpropagation →
To calculate the gradient of the error →
first, we have to calculate the partial derivative of E with respect to every weight →
and to calculate those partial derivatives, we have to calculate all of these intermediate derivatives →
Let's calculate. With the help of the hidden activation h →
we can calculate →
In the same way, you can calculate all the partial derivatives and put them all into the equation for the gradient of the error.
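The intermediate equations above were images in the original post. As a hedged reconstruction in code, here is a backward pass for the three-layer network sketched earlier, assuming squared error and sigmoid activations throughout (the variable names are mine, not the article's):

```python
def backward(x, y):
    # Forward pass, keeping the intermediate activations for the chain rule.
    h1 = sigmoid(x @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    y_hat = sigmoid(h2 @ W3 + b3)

    # Chain rule layer by layer; sigmoid'(z) = s * (1 - s).
    d3 = (y_hat - y) * y_hat * (1 - y_hat)   # error signal at the output
    d2 = (d3 @ W3.T) * h2 * (1 - h2)         # propagated to hidden layer 2
    d1 = (d2 @ W2.T) * h1 * (1 - h1)         # propagated to hidden layer 1

    # Partial derivatives of E with respect to every weight and bias.
    grads = {
        "W3": np.outer(h2, d3), "b3": d3,
        "W2": np.outer(h1, d2), "b2": d2,
        "W1": np.outer(x, d1),  "b1": d1,
    }
    return grads, y_hat
```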
4. Update of the weights and biases
Now update all the weights and biases →
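Using the gradients from the sketch above, the update step could look like this, with an assumed learning rate lr and an assumed target y:

```python
lr = 0.5                                       # assumed learning rate
grads, y_hat = backward(x, y=np.array([1.0]))  # assumed target output

W1 -= lr * grads["W1"]; b1 -= lr * grads["b1"]
W2 -= lr * grads["W2"]; b2 -= lr * grads["b2"]
W3 -= lr * grads["W3"]; b3 -= lr * grads["b3"]
```

Each parameter moves a small step against its own partial derivative, which is exactly the gradient descent rule shown earlier.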
Now, if you want to play with the code, just download the code for backpropagation and feed-forward from the link given below →
Thank you for reading. I hope you enjoyed this post, learned something new and useful, and understood how backpropagation works. If I missed anything, let me know in the comments. If you liked it, please hold the clap button and share it with your friends! :)