From Neurons to Knowledge: Understanding Forward Pass and Backpropagation Step by Step

Athar Obaid Khan
6 min read · Nov 15, 2023

--

Artificial neural networks have become a cornerstone of modern technology, powering everything from self-driving cars to recommendation systems. These networks are inspired by the human brain, with layers of interconnected neurons processing information.

While there are numerous articles and tutorials available on this topic, here, I’m excited to share my unique perspective and efforts.

In this article, we’ll dive into the core concepts of neural networks, focusing on the step-by-step processes of the forward pass and backpropagation. By the end of this journey, you’ll have a solid grasp of how neural networks work.

The Basics of Neural Networks

Before we delve into the mechanics of the forward pass and backpropagation, let’s establish a foundation. Neural networks consist of layers of neurons, each connected to the next. These neurons process information and learn patterns from data. The first layer is the input layer, the middle layers are the hidden layers, and the last layer is the output layer.

Architecture of a Neural Network (image by UpGrad)

Inside the Neuron: The Building Blocks of Decision-Making

Before we dive into the intricacies of the forward pass and backpropagation, it’s essential to demystify what happens inside each neuron. Neurons are the fundamental units of a neural network, responsible for processing information and making decisions.

Process inside each neuron (image by author)

In a neural network, various components work together to process and learn from data:

  • Inputs: These are the features of our data, serving as the initial information fed into the network.
  • Weights: Weights are coefficients that are multiplied by the inputs. The primary objective of the network is to find the optimal weights that lead to accurate predictions.
  • Linear Weighted Sum: This term represents the aggregation of products between inputs and their corresponding weights, further adjusted by adding a bias or offset term, denoted as ‘b.’
  • Hidden Layer: Within the hidden layer, multiple neurons are situated, each responsible for learning different patterns in the data. The superscript signifies the layer, while the subscript denotes the specific neuron or perceptron within that layer.
  • Edges/Arrows: These correspond to the weights connecting the various inputs, whether they are the features or the outputs from the hidden layer. For clarity, they may be omitted in visual representations.
  • Activation Function: The activation function introduces non-linearity into the network’s processing. It determines whether a neuron should activate based on the weighted sum. Common activation functions include sigmoid, ReLU, and softmax.
  • Prediction: The neural network uses these components to make predictions based on the learned patterns and optimal weights.

These components collectively enable a neural network to process, learn from, and make predictions based on the provided data.
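As a quick sketch, the three activation functions named above can be written in a few lines of NumPy (these are generic implementations, not tied to the example network that follows):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Keeps positive values, zeroes out negatives
    return np.maximum(0.0, z)

def softmax(z):
    # Turns a vector of scores into probabilities that sum to 1
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()
```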

In the following sections, we’ll delve into the mechanics of the forward pass and backpropagation, unveiling these processes step by step.

The Forward Pass

The forward pass in a neural network is the process of taking input data, multiplying it by weights, applying activation functions, and passing it through the network’s layers to generate predictions or outputs. It represents the flow of information from the input layer to the output layer, making it the first step in data processing and decision-making.
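The steps just described can be sketched for a small two-layer network in NumPy. The shapes, the sigmoid activation, and the numbers below are illustrative assumptions, not the exact values of the example we work through by hand:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Input layer -> hidden layer: weighted sum plus bias, then activation
    z1 = W1 @ x + b1
    a1 = sigmoid(z1)
    # Hidden layer -> output layer: same pattern again
    z2 = W2 @ a1 + b2
    a2 = sigmoid(z2)
    return a2  # the network's predictions

# Illustrative values only
x  = np.array([0.05, 0.10])                   # two input features
W1 = np.array([[0.15, 0.20], [0.25, 0.30]])   # input -> hidden weights
b1 = np.array([0.35, 0.35])
W2 = np.array([[0.40, 0.45], [0.50, 0.55]])   # hidden -> output weights
b2 = np.array([0.60, 0.60])

prediction = forward(x, W1, b1, W2, b2)
```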

In this article, we will manually perform the forward pass. In practice, neural networks are typically implemented using frameworks like PyTorch or TensorFlow. However, conducting the forward pass manually will provide valuable insights into the underlying process.

Example Neural Network

Here’s the basic structure with initial values:

Image by author

For the input and output layers, we denote the value of the weighted sum of a neuron as:

And its value after the activation function as:

Input layer to Hidden layer

Hidden Layer to Output layer

As we can see, the predicted values of both outputs differ from the target values, i.e.

Predicted value
Target Value

Now, we calculate the error for each output neuron using the squared-error function and sum them to get the total error:
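In code, the squared-error sum looks like this (with the conventional 1/2 factor, which cancels neatly when we differentiate later):

```python
import numpy as np

def total_error(predictions, targets):
    # Squared error per output neuron, summed into one total
    return 0.5 * np.sum((targets - predictions) ** 2)
```

A perfect prediction gives a total error of exactly zero; any mismatch contributes its squared difference.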

The Backward Pass

Our goal with backpropagation is to update the weights in the network so that the actual outputs move closer to the target outputs, thereby minimizing the error for each output neuron.

Process of Backpropagation. Image by Author

In the backward pass, gradients are calculated using the chain rule of calculus, starting from the output layer and propagating backward through the network. These gradients represent the sensitivity of the error to each parameter. The network’s weights and biases are then updated in the opposite direction of the gradient using optimization algorithms, such as gradient descent, to minimize the error and enhance the network’s performance.
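The update rule itself is a single line; here it is as a sketch (the numbers are made up for illustration):

```python
def gradient_descent_step(param, grad, alpha=0.01):
    # A positive gradient means increasing the parameter increases the
    # error, so we step in the opposite direction
    return param - alpha * grad

# e.g. a weight of 0.40 whose error gradient is 0.08
new_w = gradient_descent_step(0.40, 0.08)  # 0.40 - 0.01 * 0.08 = 0.3992
```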

So, we apply the chain rule layer by layer.

Output layer

Consider W7: we want to know how much a change in W7 affects the total error, i.e. ∂E_total/∂W7.

Similarly, we will calculate for W8, W9, and W10. For these calculations, the required derivatives are:

Now, we know all the derivatives, so the values are:
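Assuming a sigmoid activation and the squared-error function from earlier, the three-factor chain-rule product for an output-layer weight can be written as follows (the function and argument names are my own, not the article's notation):

```python
def dE_dw_output(a_hidden, a_out, target):
    # Chain rule: dE/dw = dE/da_out * da_out/dz_out * dz_out/dw
    dE_da = a_out - target           # from E = 0.5 * (target - a_out)**2
    da_dz = a_out * (1.0 - a_out)    # sigmoid derivative, in terms of its output
    dz_dw = a_hidden                 # z = w * a_hidden + b, so dz/dw = a_hidden
    return dE_da * da_dz * dz_dw
```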

Similarly, if we want the change in the bias b2, then:

Hidden layer

Next, we continue the backward pass by calculating new values for W1, W2, W3, W4, W5, W6, and b1 in a similar way.

So, to apply the chain rule, we require some derivatives:

Now, applying the chain rule, the effects of changes in W1, W2, W3, W4, W5, W6, and b1 on the total error are:

Similarly, if we want the change in the bias b1, then:
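The hidden-layer case differs only in that the error signal must first travel back through every output neuron the hidden neuron feeds. A sketch under the same sigmoid and squared-error assumptions (names are mine):

```python
import numpy as np

def dE_dw_hidden(x_in, a_hidden, a_out, target, w_out):
    # Error signal at each output neuron: dE/dz_out
    delta_out = (a_out - target) * a_out * (1.0 - a_out)
    # Sum the signals back through the weights leaving this hidden neuron
    dE_da_hidden = np.dot(delta_out, w_out)
    # Then through the hidden neuron's own sigmoid, down to the weight
    da_dz = a_hidden * (1.0 - a_hidden)
    return dE_da_hidden * da_dz * x_in
```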

Now, we have all the error derivatives and are ready to update the parameters after the first iteration of backpropagation, using a learning rate of α = 0.01.

Finally, we update all of our weights and biases. This is only the first round of backpropagation; we can repeat the process many times until the error is minimized.
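Putting everything together, repeated rounds of forward pass plus backpropagation can be sketched as a short loop. The network size, random initial weights, targets, and the larger learning rate here are all illustrative choices, not the article's exact figures:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.05, 0.10])                       # inputs
t = np.array([0.01, 0.99])                       # targets
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)    # input -> hidden
W2, b2 = rng.normal(size=(2, 2)), np.zeros(2)    # hidden -> output
alpha = 0.5   # learning rate (larger than 0.01 so convergence shows quickly)

for _ in range(1000):
    # Forward pass
    a1 = sigmoid(W1 @ x + b1)
    a2 = sigmoid(W2 @ a1 + b2)
    # Backward pass: error signals via the chain rule
    delta2 = (a2 - t) * a2 * (1 - a2)            # output layer
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)     # hidden layer
    # Gradient-descent updates for every weight and bias
    W2 -= alpha * np.outer(delta2, a1); b2 -= alpha * delta2
    W1 -= alpha * np.outer(delta1, x);  b1 -= alpha * delta1

error = 0.5 * np.sum((t - a2) ** 2)              # total error after training
```

After enough rounds, the total error falls close to zero as the predictions approach the targets.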


Athar Obaid Khan

M.Sc. Computer Science (Artificial Intelligence and Machine Learning)