100 days of data science and AI Meditation (Day 11 - Backpropagation Training)

Farzana huq
6 min read · Jul 27, 2023


This is part of my data science and AI marathon, where every single day I write about what I have studied and implemented in academia and at work.

Imagine you have a magical machine (a neural network) that can learn to do specific tasks, like recognizing objects in images, translating languages, or predicting stock prices. However, this machine doesn’t know anything initially; it needs training.

Training this magical machine is like teaching a child. You show the machine many examples of the task you want it to learn. For example, if you want it to recognize cats, you’ll show it pictures of cats and tell it, “This is a cat.” The machine learns from these examples and tries to make predictions on its own.

But how does the machine know if its predictions are correct? That’s where backpropagation comes in. It’s like a teacher who corrects the child’s mistakes. After the machine makes predictions, you compare them to the correct answers you already know (the target). If the machine’s prediction is wrong, you tell it how wrong it was and in which direction it should adjust itself.

Backpropagation is like a magic spell that helps the machine learn from its mistakes. It figures out how much the machine’s predictions need to change to get closer to the correct answers. With each correction, the machine becomes better and better at making accurate predictions.

This process repeats many times, with the machine looking at more examples and adjusting itself based on the mistakes it makes. Over time, the machine becomes really good at the task you taught it. And just like a child becomes better with practice, this magical machine becomes a powerful tool that can do amazing things like understanding language, making recommendations, or even driving cars autonomously.

Backpropagation, short for “backward propagation of errors,” is a fundamental algorithm used in training artificial neural networks. It is a key component of many machine learning models, including deep learning architectures. The purpose of backpropagation is to update the model’s parameters (weights and biases) to minimize the difference between the predicted outputs and the actual targets.

Here’s a simplified explanation of how backpropagation works:

  1. Forward Pass: During the forward pass, the input data is fed through the neural network layer by layer. Each layer performs a linear transformation (using weights and biases) followed by an activation function, producing output values.
  2. Loss Calculation: The output values generated during the forward pass are compared to the actual target values, and a loss function (e.g., mean squared error or cross-entropy) is used to quantify the difference between the predicted and actual outputs.
  3. Backward Pass: In the backward pass, the algorithm works to minimize the loss by adjusting the network’s parameters. It starts by calculating the gradient of the loss function with respect to each model parameter (weight and bias).
  4. Gradient Descent: The gradients computed in the backward pass give the direction and magnitude of the steepest increase in the loss function, so the model parameters are updated in the opposite direction to reduce the loss (a short numeric example follows this list).
  5. Iterative Process: Steps 1 to 4 are repeated multiple times (known as epochs) to iteratively update the model parameters and refine the model’s predictions. With each iteration, the model gets closer to producing accurate predictions.
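To make step 4 concrete, here is a single gradient-descent update on one weight, using a deliberately simple quadratic loss (the loss and the numbers are purely illustrative):

```python
learning_rate = 0.1
w = 2.0                       # current value of the weight
grad = 2 * (w - 5.0)          # dL/dw for the illustrative loss L(w) = (w - 5)**2
w = w - learning_rate * grad  # step against the gradient: w moves from 2.0 to 2.6, toward 5.0
```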

Let’s have a look at a simple Python project that implements a neural network with backpropagation for the XOR problem. The XOR problem is a classic problem in which the neural network is trained to predict the output of the XOR gate.

First, make sure you have the NumPy library installed. If you don’t have it, you can install it using pip install numpy.

Define the sigmoid activation function and its derivative:
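Here is a minimal sketch of these two helpers (the function names are my own, and the derivative is written in terms of the sigmoid's output, a common shortcut in implementations like this one):

```python
import numpy as np

def sigmoid(x):
    # Map any real value into the (0, 1) range.
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # Derivative of the sigmoid in terms of its output:
    # if s = sigmoid(x), then d(sigmoid)/dx = s * (1 - s).
    return s * (1 - s)
```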

The sigmoid activation function is used to introduce non-linearity in the neural network. It maps any input value to a value between 0 and 1. The derivative of the sigmoid function is used in backpropagation to calculate the gradients.

The next step is to define the NeuralNetwork class:
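A sketch of the constructor; the attribute names, the bias-free design, and the uniform initialization range are assumptions made to keep the example small:

```python
class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Randomly initialize the two weight matrices
        # (biases are omitted for brevity).
        self.weights_input_hidden = np.random.uniform(-1, 1, (input_size, hidden_size))
        self.weights_hidden_output = np.random.uniform(-1, 1, (hidden_size, output_size))
```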

The NeuralNetwork class represents our neural network. In the constructor (__init__), we initialize the network's architecture and randomize the weights.

Implement forward propagation:
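Continuing the class above, the method could look like this:

```python
    def forward_propagation(self, X):
        # Hidden layer: dot product with the first weight matrix,
        # followed by the sigmoid activation.
        self.hidden_output = sigmoid(np.dot(X, self.weights_input_hidden))
        # Output layer: same pattern applied to the hidden activations.
        self.output = sigmoid(np.dot(self.hidden_output, self.weights_hidden_output))
        return self.output
```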

The forward_propagation method takes input data X and computes the output of the neural network by passing it through the hidden layer and the output layer. We use the dot product of the input and weights, followed by the sigmoid activation function.

Implement backpropagation:
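One way to write it; the learning rate of 0.1 is an assumed hyperparameter:

```python
    def backpropagation(self, X, y, output, learning_rate=0.1):
        # Error and gradient ("delta") at the output layer.
        output_error = y - output
        output_delta = output_error * sigmoid_derivative(output)

        # Propagate the error backwards to the hidden layer.
        hidden_error = np.dot(output_delta, self.weights_hidden_output.T)
        hidden_delta = hidden_error * sigmoid_derivative(self.hidden_output)

        # Nudge both weight matrices in the direction that reduces the error.
        self.weights_hidden_output += learning_rate * np.dot(self.hidden_output.T, output_delta)
        self.weights_input_hidden += learning_rate * np.dot(X.T, hidden_delta)
```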

The backpropagation method is used to update the weights of the neural network based on the difference between the predicted output (output) and the actual output (y). We calculate the gradients and update the weights for both the output layer and the hidden layer.

Train the neural network:
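A minimal version, where each epoch is one forward pass plus one weight update over the whole (tiny) dataset:

```python
    def train(self, X, y, epochs, learning_rate=0.1):
        for _ in range(epochs):
            output = self.forward_propagation(X)
            self.backpropagation(X, y, output, learning_rate)
```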

The train method is used to train the neural network. It iterates over the specified number of epochs and updates the weights through forward propagation and backpropagation.

Make predictions using the trained neural network:
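Since a prediction is just a forward pass through the trained network, this can be a one-liner:

```python
    def predict(self, X):
        return self.forward_propagation(X)
```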

The predict method takes input data X and returns the output of the neural network, which represents the predictions.

Set up the XOR problem inputs and outputs:
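The four possible input pairs and their XOR labels:

```python
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])
```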

We define the input data X and the corresponding output y for the XOR problem.

In the final stage, we create a neural network instance, train it, and make predictions:
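Putting it all together (the hidden-layer size of 4 is an arbitrary but common choice for this problem):

```python
nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
nn.train(X, y, epochs=10000)
print(nn.predict(X))
```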

We create an instance of the NeuralNetwork class, train it with the XOR data for 10000 epochs, and then print the predictions for the XOR inputs.
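After training, the printed values should be close to 0, 1, 1, 0, although the exact numbers vary from run to run with the random weight initialization.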

Backpropagation allows the neural network to “learn” from the training data and improve its ability to make accurate predictions by continuously adjusting the model’s parameters based on the gradients. The cycle of forward pass, loss calculation, backward pass, and gradient descent is repeated until the model converges to a state where the loss is minimized and the model performs well on unseen data (that is, it generalizes).

Backpropagation is a powerful algorithm that has enabled the success of many complex machine learning models, especially in the field of deep learning. It is an essential tool in training neural networks and has contributed to significant advancements in various domains, including natural language processing, computer vision, and speech recognition, among others.


If you enjoy reading stories like these and want to support me as a writer, consider signing up to become a Medium member. It’s $5/month, giving you unlimited access to thousands of stories on Medium, written by thousands of writers. If you sign up using my link https://medium.com/@fhuqtheta, I’ll earn a small commission.
