Understanding Gradient Descent in 1D and 2D in Deep Learning

Erhan Arslan
4 min read · Nov 26, 2023


Welcome to another important subtopic of Deep Learning.

In my earlier blog, Understanding Gradient Descent, we covered the basics of Gradient Descent in 1D.

In this post, let us dig into the details and extend the idea from 1D to 2D.

There, we explained Gradient Descent in 1D with an analogy: "Gradient descent is like trying to find the lowest point in a valley while blindfolded."
In 1D, imagine a curvy road with hills and valleys. You take small steps and check whether going left or right brings you closer to the lowest point (the valley). This decision is guided by how steep the road is at each step (the derivative).
Let us give a real-life example for the 1D case:
Consider optimizing the fuel efficiency of a car. You are tweaking a single parameter (such as speed) to minimize fuel consumption. You adjust the speed slightly and observe how it affects efficiency until you find the best speed.

Now, let us combine the real-life example with 1D Gradient Descent code.
As a reminder, our gradient descent update rule was:

x_{i+1} = x_i − α · df(x_i)/dx

Here, α is the learning rate. You can customize it, but be careful with it.
x_i is the current position, and df(x_i)/dx is the derivative of f(x) at x_i.
Of course, there is no need to derive the formula by hand or code a symbolic differentiator; various math functions and tools can do it for us.
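As a side note on "math functions and tools": when an analytic derivative is inconvenient, a simple numerical approximation works too. Here is a minimal sketch using a central-difference approximation; the helper name `numerical_derivative` is my own, not from any library.

```python
def fuel_efficiency(speed):
    return (speed - 60)**2 + 50

def numerical_derivative(f, x, h=1e-6):
    # Central-difference approximation: (f(x+h) - f(x-h)) / (2h)
    return (f(x + h) - f(x - h)) / (2 * h)

# The analytic derivative of (x - 60)^2 + 50 is 2*(x - 60),
# so at speed = 70 the slope should come out very close to 20.
print(numerical_derivative(fuel_efficiency, 70))
```

You could drop this in place of `efficiency_derivative` below and get essentially the same trajectory.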

import numpy as np
import matplotlib.pyplot as plt

# This is just a sample. If speed is x, I take (x - 60)^2 + 50 as our function.
# For gradient descent, we also need its derivative.
# Note: fuel_efficiency itself is never called in the loop; only its derivative is.
def fuel_efficiency(speed):
    return (speed - 60)**2 + 50

def efficiency_derivative(speed):
    return 2 * (speed - 60)

# As we know from my other blog about Gradient Descent, here is the basic algorithm.
def gradient_descent_1d(speed_init, learning_rate, epochs):
    speed = speed_init
    speed_values = [speed]  # to store speed values for visualization
    for _ in range(epochs):
        gradient = efficiency_derivative(speed)
        speed = speed - learning_rate * gradient
        speed_values.append(speed)  # append speed for visualization
    return speed_values

# I take a custom static initial start point of 70. You can change it,
# or draw a random starting value with np.random.randn.
initial_speed = 70
learning_rate = 0.1
num_epochs = 100
result_1d = gradient_descent_1d(initial_speed, learning_rate, num_epochs)

iterations = np.arange(num_epochs + 1)
plt.plot(iterations, result_1d, marker='o')
plt.xlabel('Iterations')
plt.ylabel('Speed')
plt.title('Gradient Descent: Speed Optimization')
plt.grid(True)
plt.show()
Output of our gradient descent. The circle markers show the speed at every iteration.
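One more word on the "be careful with the learning rate" warning above. Since each step multiplies the distance to the optimum by (1 − 2α) for this function, a small α converges while a too-large α diverges. A minimal sketch, stripping the algorithm above down to its final value:

```python
def final_speed_1d(speed, learning_rate, epochs):
    # Same update rule as above, keeping only the final speed
    for _ in range(epochs):
        speed = speed - learning_rate * 2 * (speed - 60)
    return speed

# alpha = 0.1 shrinks the distance to the optimum by 0.8x per step
print(final_speed_1d(70, 0.1, 100))  # lands very close to 60

# alpha = 1.1 flips the sign and grows the distance by 1.2x per step
print(final_speed_1d(70, 1.1, 100))  # oscillates and blows up far from 60
```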

Now, let us continue with 2D.

In fact, the procedure is exactly the same. Above, we had one parameter to optimize fuel efficiency: we adjusted speed alone.
Now, in 2D, what if we want to consider two parameters?

In 2D, the function becomes f(x, y) instead of f(x), and the update rule changes to:

x_{i+1} = x_i − α · ∂f(x_i, y_i)/∂x
y_{i+1} = y_i − α · ∂f(x_i, y_i)/∂y

In 1D, the derivative measures the rate of change along a single dimension. In 2D, partial derivatives capture the rate of change with respect to each variable independently, while the others are held constant.
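The "held constant" part can be made concrete in a few lines. Here is a minimal sketch on an arbitrary toy function of my own choosing (f(x, y) = x² + xy, not the fuel-efficiency function used later), approximating each partial derivative by varying only one argument:

```python
def f(x, y):
    return x**2 + x * y  # arbitrary toy function for illustration

def partial_x(f, x, y, h=1e-6):
    # Hold y fixed, vary only x: the definition of a partial derivative
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # Hold x fixed, vary only y
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

# Analytically: df/dx = 2x + y and df/dy = x, so at (2, 3) we expect 7 and 2.
print(partial_x(f, 2, 3), partial_y(f, 2, 3))
```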

Let us continue with the same example and add another parameter to it.

I do not know the exact mechanical formula for fuel efficiency; these are just made-up formulas :)

Maybe we can add the engine load. Again, I am just writing down a random formula.
x is speed
y is engine load
f(x, y) = (x − 60)² + (y − 30)² + 50

I will not derive the partial derivatives of each parameter here; that is easy to find elsewhere, and there are plenty of online platforms that compute partial derivatives. I used one of them when writing the code. For the record, they come out to ∂f/∂x = 2(x − 60) and ∂f/∂y = 2(y − 30).

Here is my Python code sample for Gradient Descent on the given f(x, y).

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# my fuel efficiency function
def fuel_efficiency_2d(speed, engine_load):
    return (speed - 60)**2 + (engine_load - 30)**2 + 50

def efficiency_derivative_speed(speed):
    return 2 * (speed - 60)

def efficiency_derivative_load(engine_load):
    return 2 * (engine_load - 30)

# 2D gradient descent
def gradient_descent_2d(speed_init, load_init, learning_rate, epochs):
    speed = speed_init
    load = load_init

    speed_values = [speed]  # for visualization
    load_values = [load]  # for visualization
    efficiency_values = [fuel_efficiency_2d(speed, load)]  # for visualization

    for _ in range(epochs):
        grad_speed = efficiency_derivative_speed(speed)  # speed partial derivative
        grad_load = efficiency_derivative_load(load)  # engine load partial derivative
        speed = speed - learning_rate * grad_speed  # update speed independently
        load = load - learning_rate * grad_load  # update load independently
        speed_values.append(speed)
        load_values.append(load)
        efficiency_values.append(fuel_efficiency_2d(speed, load))

    return speed_values, load_values, efficiency_values

# you can select any other initial values.
initial_speed = 70
initial_load = 40
learning_rate = 0.1
num_epochs = 50
speed_vals, load_vals, efficiency_vals = gradient_descent_2d(initial_speed, initial_load, learning_rate, num_epochs)

# Plotting the 3D graph
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

speed_vals = np.array(speed_vals)
load_vals = np.array(load_vals)
efficiency_vals = np.array(efficiency_vals)

ax.plot(speed_vals, load_vals, efficiency_vals, marker='o')
ax.set_xlabel('Speed')
ax.set_ylabel('Engine Load')
ax.set_zlabel('Function Value (Fuel Efficiency)')
ax.set_title('Gradient Descent in 2D: Speed vs Engine Load for Fuel Efficiency')

plt.show()

Here is the output. Looking at the function's lowest value:
a speed around 60 and an engine load around 30 is the best option for fuel efficiency.
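We can verify that "around 60 and around 30" numerically rather than by eye. A compact sketch, reusing the same update rule and the same starting point as the code above, keeping only the final values:

```python
def final_point_2d(speed, load, learning_rate, epochs):
    # Same per-coordinate updates as gradient_descent_2d above
    for _ in range(epochs):
        speed = speed - learning_rate * 2 * (speed - 60)
        load = load - learning_rate * 2 * (load - 30)
    return speed, load

final_speed, final_load = final_point_2d(70, 40, 0.1, 50)
# After 50 steps, both coordinates sit within roughly 1e-4 of the optimum (60, 30)
print(final_speed, final_load)
```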

I hope you liked it. Thank you for reading.
