Visualize the gradient descent of a cost function with its level curves (Python)

Joséphine Picot
Published in Analytics Vidhya
3 min read · Feb 11, 2021

Hi! Here we will compute the gradient of a simple two-variable cost function by hand and plot the path followed by gradient descent on top of the function's level curves. All the code is available on my GitHub at this link.

Image by author

Installations

pip install numpy
pip install matplotlib

Imports

import numpy as np
import matplotlib.pyplot as plt
import math

First, we define our cost function:

def f(x1, x2):
    return 0.5*x1**2 + (5/2)*x2**2 - x1*x2 - 2*(x1 + x2)
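
As a quick sanity check (a small sketch, not part of the original code), we can evaluate f at a couple of points. Because the expression only uses arithmetic operators, it also works element-wise on NumPy arrays, which is what the contour plots rely on later. The sample points and the names sample_x1, sample_x2 and values are arbitrary choices for illustration.

```python
import numpy as np

def f(x1, x2):
    return 0.5*x1**2 + (5/2)*x2**2 - x1*x2 - 2*(x1 + x2)

# Scalar evaluation at two illustrative points
print(f(2, 1))  # cost at the starting point used in this article
print(f(3, 1))  # cost at a nearby point

# The same function applied element-wise to arrays
sample_x1 = np.array([2.0, 3.0])
sample_x2 = np.array([1.0, 1.0])
values = f(sample_x1, sample_x2)
print(values)
```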

We then compute the gradient of our function by hand:

Image by author
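
Since the formula lives in an image, here it is written out as well. This is recomputed from the definition of f, and it agrees term by term with the gradient function defined in the code:

```latex
\[
f(x_1, x_2) = \tfrac{1}{2}x_1^2 + \tfrac{5}{2}x_2^2 - x_1 x_2 - 2(x_1 + x_2)
\]
\[
\nabla f(x_1, x_2) =
\begin{pmatrix}
\dfrac{\partial f}{\partial x_1} \\[2mm]
\dfrac{\partial f}{\partial x_2}
\end{pmatrix}
=
\begin{pmatrix}
x_1 - x_2 - 2 \\
5x_2 - x_1 - 2
\end{pmatrix}
\]
```

Setting both components to zero gives x1 = 3 and x2 = 1, which is the point the descent should approach.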

We must define two functions:

  • The gradient function, which returns the result we calculated above.
  • The norm function, which computes the Euclidean norm of the gradient. The distance traveled at each iteration is the step size times this norm, so it tells us how far each step moves and when to stop.

def gradient(x1, x2):
    # Analytic gradient of f: [df/dx1, df/dx2]
    return np.array([-2 + x1 - x2, -2 - x1 + 5*x2])

def norm(matrice_1x2):
    # Euclidean norm of a 1-D array with two components
    n_line = matrice_1x2.shape[0]
    N = 0
    for i in range(n_line):
        N += matrice_1x2[i]**2
    return math.sqrt(N)
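
A hand-derived gradient is easy to get wrong, so it can be worth comparing it with a numerical estimate. The sketch below uses central finite differences; numerical_gradient and its step h are hypothetical helpers added for illustration, not part of the original code. It also shows that np.linalg.norm gives the same Euclidean norm as the hand-written norm function.

```python
import numpy as np

def f(x1, x2):
    return 0.5*x1**2 + (5/2)*x2**2 - x1*x2 - 2*(x1 + x2)

def gradient(x1, x2):
    return np.array([-2 + x1 - x2, -2 - x1 + 5*x2])

def numerical_gradient(x1, x2, h=1e-6):
    # Hypothetical helper: central finite differences in each direction
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2*h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2*h)
    return np.array([d1, d2])

point = (2.0, 1.0)  # the starting point used in this article
analytic = gradient(*point)
numeric = numerical_gradient(*point)
print(analytic, numeric)

# Built-in Euclidean norm, equivalent to the hand-written norm()
print(np.linalg.norm(analytic))
```

If the two vectors disagree by more than a small rounding error, the manual derivation is the first thing to recheck.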

We initialize our variables x1 and x2 with arbitrary values:

x1, x2 = 2, 1

We set a value for our step size t. Up to a point, a bigger t makes the algorithm converge faster, but if t is too big the iterates overshoot the minimum and the algorithm may diverge, so be careful and test several values for the step.

t = 0.1
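
For a quadratic cost like ours, the second derivatives are constant, and gradient descent with a fixed step is known to converge only when t is below 2 divided by the largest eigenvalue of the Hessian matrix. The sketch below computes that bound with NumPy. The names H and t_max are introduced here for illustration, and the Hessian entries are read off from the definition of f.

```python
import numpy as np

# Hessian of f: second partial derivatives, constant for a quadratic
H = np.array([[1.0, -1.0],
              [-1.0, 5.0]])

# eigvalsh handles symmetric matrices and returns eigenvalues in ascending order
eigenvalues = np.linalg.eigvalsh(H)

# Fixed-step gradient descent converges on this quadratic for 0 < t < t_max
t_max = 2.0 / eigenvalues[-1]
print(eigenvalues, t_max)
```

The bound comes out at about 0.38, so t = 0.1 is comfortably on the safe side.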

We also set a value for the “epsilon” threshold: we stop iterating as soon as the norm of the gradient is no longer greater than this threshold. At that point each step moves us by at most t times epsilon, so we are practically standing still.

epsilon = 1e-6

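Then we assign the initial values for the rest of our variables:

grad_f = gradient(x1, x2)      # gradient at the starting point
n_grad = norm(grad_f)          # its norm, used by the stopping test
i = 1                          # iteration counter
evolution_X1_X2 = [[x1, x2]]   # history of the visited points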

Gradient Descent

while n_grad > epsilon:
    direction = -grad_f  # steepest descent direction
    x1, x2 = x1 + t*direction[0], x2 + t*direction[1]
    evolution_X1_X2 = np.vstack((evolution_X1_X2, [x1, x2]))
    grad_f = gradient(x1, x2)
    n_grad = norm(grad_f)
    i += 1
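
If we want to reuse the loop with other starting points or step sizes, one option is to wrap it in a function. The version below is only a sketch: gradient_descent and its max_iter safety cap are assumptions added for illustration (the original loop has no iteration limit), and it collects the path in a list instead of calling np.vstack at every step.

```python
import numpy as np

def gradient(x1, x2):
    return np.array([-2 + x1 - x2, -2 - x1 + 5*x2])

def gradient_descent(start, t=0.1, epsilon=1e-6, max_iter=10_000):
    """Hypothetical wrapper around the loop above; returns all visited points."""
    x = np.array(start, dtype=float)
    path = [x.copy()]
    for _ in range(max_iter):  # safety cap in case t is too large to converge
        g = gradient(x[0], x[1])
        if np.linalg.norm(g) <= epsilon:  # same stopping test as the while loop
            break
        x = x - t * g  # step against the gradient
        path.append(x.copy())
    return np.array(path)

path = gradient_descent((2, 1))
print(len(path) - 1, path[-1])  # number of steps taken and final point
```

The final point should be very close to (3, 1), where the gradient vanishes.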

We retrieve the evolution of our two variables x1 and x2 from the columns of the evolution_X1_X2 array:

evolution_X1 = evolution_X1_X2[:, 0]
evolution_X2 = evolution_X1_X2[:, 1]
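
Before plotting, it can be reassuring to check that the cost really goes down along the path. The sketch below re-runs a short descent from the same starting point so that it stands on its own; the names points and costs and the choice of 50 steps are arbitrary and only for illustration.

```python
import numpy as np

def f(x1, x2):
    return 0.5*x1**2 + (5/2)*x2**2 - x1*x2 - 2*(x1 + x2)

def gradient(x1, x2):
    return np.array([-2 + x1 - x2, -2 - x1 + 5*x2])

# Short descent with the same start and step size as in the article
x = np.array([2.0, 1.0])
t = 0.1
points = [x.copy()]
for _ in range(50):  # 50 steps is an arbitrary choice for this check
    x = x - t * gradient(x[0], x[1])
    points.append(x.copy())
points = np.array(points)

# Cost at every visited point, computed element-wise on the two columns
costs = f(points[:, 0], points[:, 1])
print(costs[0], costs[-1])
print(np.all(np.diff(costs) < 0))  # True when every step lowers the cost
```

Plotting costs against the iteration number with plt.plot(costs) is another simple way to see the convergence.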

Visualization 1

x1 = np.linspace(2, 3.5, 150)
x2 = np.linspace(0.25, 1.75, 150)
X1, X2 = np.meshgrid(x1, x2)
Z = f(X1, X2)
fig = plt.figure(figsize = (10,7))
contours = plt.contour(X1, X2, Z, 20)
plt.clabel(contours, inline = True, fontsize = 10)
plt.title("Evolution of the cost function during gradient descent with level curves", fontsize=15)
plt.plot(evolution_X1, evolution_X2)
plt.plot(evolution_X1, evolution_X2, '*', label = "Gradient descent iterates")
plt.xlabel('x1', fontsize=11)
plt.ylabel('x2', fontsize=11)
plt.colorbar()
plt.legend(loc = "upper right")
plt.show()

Image by author

Another, more colorful version: let's be crazy!

x1, x2 = -25, -35  # start much farther from the minimum this time
t = 0.1
epsilon = 1e-6
grad_f = gradient(x1, x2)
n_grad = norm(grad_f)
i = 1
evolution_X1_X2 = [[x1, x2]]

while n_grad > epsilon:
    direction = -grad_f
    x1, x2 = x1 + t*direction[0], x2 + t*direction[1]
    evolution_X1_X2 = np.vstack((evolution_X1_X2, [x1, x2]))
    grad_f = gradient(x1, x2)
    n_grad = norm(grad_f)
    i += 1

evolution_X1 = evolution_X1_X2[:, 0]
evolution_X2 = evolution_X1_X2[:, 1]

Visualization 2

x1 = np.linspace(-30, 25, 150)
x2 = np.linspace(-40, 20, 150)
X1, X2 = np.meshgrid(x1, x2)
Z = f(X1, X2)
fig = plt.figure(figsize = (10,7))
plt.imshow(Z, extent = [-30,25,-40,20], origin = 'lower', cmap = 'jet', alpha = 1)
plt.title("Evolution of the cost function during gradient descent (heat map)", fontsize=15)
plt.plot(evolution_X1, evolution_X2)
plt.plot(evolution_X1, evolution_X2, '*', label = "Gradient descent iterates")
plt.xlabel('x1', fontsize=11)
plt.ylabel('x2', fontsize=11)
plt.colorbar()
plt.legend(loc = "upper right")
plt.show()

Image by author
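
The two views can also be combined. As a rough sketch (not the code behind the images above), the snippet below draws the heat map, overlays white level curves on it and marks the point (3, 1) where the gradient is zero. The names grid_x1, grid_x2, image and lines, the number of contour levels and the styling are all illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x1, x2):
    return 0.5*x1**2 + (5/2)*x2**2 - x1*x2 - 2*(x1 + x2)

# Same grid as in the second visualization
grid_x1 = np.linspace(-30, 25, 150)
grid_x2 = np.linspace(-40, 20, 150)
X1, X2 = np.meshgrid(grid_x1, grid_x2)
Z = f(X1, X2)

fig, ax = plt.subplots(figsize=(10, 7))
# Heat map of the cost, with the axes mapped to the grid limits
image = ax.imshow(Z, extent=[-30, 25, -40, 20], origin='lower', cmap='jet')
# Level curves drawn on top of the heat map
lines = ax.contour(X1, X2, Z, 15, colors='white', linewidths=0.8)
# Point where the gradient vanishes
ax.plot(3, 1, 'k*', markersize=12, label='Analytic minimum (3, 1)')
fig.colorbar(image, ax=ax, label='f(x1, x2)')
ax.set_xlabel('x1')
ax.set_ylabel('x2')
ax.legend(loc='upper right')
# plt.show()  # uncomment when running interactively
```

The descent path can be added to the same axes with ax.plot(evolution_X1, evolution_X2) once the arrays from the loop are available.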

I hope you found this interesting!

We can do so many things with data, we just have to find a project that thrills us!