Essential Python for Machine Learning: AutoGrad
The Effortless Differentiation Magician
This is the fifth chapter of my ebook.
Introduction
In machine learning, the ability to compute gradients efficiently is fundamental to training and optimizing models. Autograd, a lightweight Python library, streamlines this process by automating the calculation of derivatives for a wide range of mathematical expressions. This chapter explores Autograd's core concepts and inner workings, then applies them in a hands-on linear regression example.
What is Autograd?
- Automatic Differentiation: Autograd’s primary role is to automatically compute derivatives of Python and NumPy code. It achieves this through a technique known as automatic differentiation (AD), eliminating the need for manual derivation of complex expressions (see the minimal example after this list).
- Reverse-Mode Differentiation: Autograd excels in reverse-mode differentiation (also known as backpropagation), which is particularly efficient for computing gradients of scalar-valued functions with respect to array-valued arguments. This makes it ideal for optimization tasks in machine learning.
- Flexibility and Versatility: Autograd can handle a broad range of Python language features, including loops, conditional statements, recursion, and closures, offering remarkable adaptability in various coding scenarios.
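A minimal sketch of this workflow, adapted from the toy function in Autograd's documentation; the finite-difference computation at the end is only a sanity check, not part of the library's API:

import autograd.numpy as np   # drop-in wrapper around NumPy
from autograd import grad

# Any scalar-valued function built from NumPy operations can be differentiated
def tanh(x):
    return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

grad_tanh = grad(tanh)   # grad returns a new function computing d(tanh)/dx
print(grad_tanh(1.0))                             # ~0.3932
print((tanh(1.0001) - tanh(0.9999)) / 0.0002)     # finite-difference check, ~0.3932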
How Autograd Works
- Graph Construction: When a function is defined, Autograd constructs a computational graph that captures the sequence of operations involved. This graph represents the function as a series of interconnected nodes, each representing an elementary operation.
- Gradient Computation: During forward propagation, Autograd not only computes the function’s output but also records the intermediate values at each node in the graph. This information is then used in reverse-mode differentiation to efficiently backpropagate gradients through the graph, computing the derivatives of the output with respect to the input variables. The sketch after this list shows this recording in action on a plain Python loop.
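Because the graph is recorded as the code actually executes, ordinary Python control flow needs no special treatment. Here is a small sketch; taylor_exp is a hypothetical illustrative function written for this example, not part of Autograd:

import autograd.numpy as np
from autograd import grad

# A truncated Taylor series for exp(x), written with a plain Python loop;
# each call records exactly the operations that execute.
def taylor_exp(x, terms=10):
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term = term * x / k    # term is x**k / k!
        total = total + term
    return total

d_exp = grad(taylor_exp)
print(d_exp(1.0))   # ~2.71828: d/dx exp(x) = exp(x), and exp(1) = e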
Example Code with a Linear Regression Problem
The full code is available in this Colab notebook.
import autograd.numpy as np
from autograd import grad
# Define the linear regression model
def model(X, w, b):
    return np.dot(X, w) + b
# Define the loss function (Mean Squared Error)
def loss(params, X, y):
    w, b = params  # params is a list [w, b], so grad differentiates w.r.t. both
    return np.mean(np.square(model(X, w, b) - y))
# Create a gradient function for the loss function
loss_grad = grad(loss)
# Generate sample data using a list of integers
n = 10
X = np.arange(n).reshape(-1, 1)  # Reshape into a column vector of shape (n, 1)
noise = np.random.randn(n, 1)    # Gaussian noise for the targets
y = 2 * X + 1 + noise
print('Num of data points:', n)
print('X shape:', X.shape)
print('noise shape:', noise.shape)
print('y shape:', y.shape)
# Initialize weights and biases randomly
w = np.random.rand(1, 1)
b = np.random.rand(1, 1)
# Perform gradient descent
learning_rate = 0.01
for i in range(100):
    grad_w, grad_b = loss_grad([w, b], X, y)
    print(f'i: {i}, w={w}, loss={loss([w, b], X, y)}, grad_w={grad_w}, grad_b={grad_b}')
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
# Print the learned weights and biases
print("Learned w:", w)
print("Learned b:", b)
# Make predictions on the same data
predictions = model(X, w, b)
# Compare predicted and real values
print("Predicted values:", predictions)
print("Actual values:", y)
# Visualize the comparison (optional)
import matplotlib.pyplot as plt
plt.plot(X, y, 'o', label='Actual data')
plt.plot(X, predictions, '-x', label='Predictions')
plt.legend()
plt.show()
Output:
...
Learned w: [[2.05026434]]
Learned b: [[0.91621226]]
Predicted values: [[ 0.91621226]
[ 2.9664766 ]
[ 5.01674094]
[ 7.06700528]
[ 9.11726963]
[11.16753397]
[13.21779831]
[15.26806266]
[17.318327 ]
[19.36859134]]
Actual values: [[ 1.47353745]
[ 3.68543065]
[ 5.43154084]
[ 5.36883116]
[10.36817663]
[13.33707709]
[14.02194099]
[13.01281091]
[16.65654291]
[19.77728432]]
Conclusion
Autograd automates the calculation of derivatives, saving time and effort in complex ML tasks. It leverages reverse-mode differentiation for efficient gradient computation, handles a wide range of Python language features for flexibility in code design, and is a valuable tool for optimization and training in machine learning.