What in god’s name is Gradient Descent?

Published in

The Making Of… a Data Scientist

7 min read · Sep 14, 2018

Gradient Descent is one of the most frequently asked-about topics in Data Science interviews and one of the simplest methods to understand when you start learning Machine Learning. Let’s finally understand what Gradient Descent is really all about!

According to Aurélien Géron’s book “Hands-On Machine Learning with Scikit-Learn & TensorFlow” (a great book for anyone who is just starting out):

Gradient Descent is a very generic optimization algorithm capable of finding optimal solutions to a wide range of problems. The general idea (…) is to tweak parameters iteratively in order to minimize a cost function.
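To make "tweak parameters iteratively" concrete, here is a minimal sketch, not taken from the book. It assumes a toy one-parameter cost function J(θ) = (θ − 3)², and the names `gradient_descent`, `cost_gradient`, `learning_rate` and `n_steps` are made up for illustration.

```python
def gradient_descent(grad, theta0, learning_rate=0.1, n_steps=100):
    """Repeatedly move theta a small step against the gradient of the cost."""
    theta = theta0
    for _ in range(n_steps):
        # The gradient points uphill, so subtracting it moves downhill.
        theta = theta - learning_rate * grad(theta)
    return theta


def cost_gradient(theta):
    """Gradient of the assumed toy cost J(theta) = (theta - 3)**2."""
    return 2.0 * (theta - 3.0)


# Start from an arbitrary guess; the toy cost has its minimum at theta = 3.
theta_min = gradient_descent(cost_gradient, theta0=0.0)
print(theta_min)
```

With these settings the distance to the minimum shrinks by a constant factor at every step, so the result lands very close to 3. A much larger learning rate would overshoot and could diverge instead.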

Before explaining Gradient Descent in more detail, it is important to understand what a Cost Function is.

What is a Cost Function?

Let’s introduce the term with the simplest example of a Machine Learning algorithm: Linear Regression. Linear Regression is used to estimate a linear relationship between continuous and/or categorical input features and a continuous output variable.

ŷ is the predicted value
n is the number of features
xᵢ is the i-th feature value
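The definitions above match the usual Linear Regression prediction: a weighted sum of the feature values plus a bias term, ŷ = θ₀ + θ₁·x₁ + … + θₙ·xₙ. The sketch below assumes that form. The parameter name `theta` and the function `predict` are hypothetical and do not come from the text above.

```python
def predict(theta, x):
    """Return the predicted value for one sample.

    theta: assumed parameter list of length n + 1, where theta[0] is the bias
           term and theta[1:] are the weights for the n features.
    x:     list of the n feature values.
    """
    if len(theta) != len(x) + 1:
        raise ValueError("theta must have one more entry than x")
    # Weighted sum of the features, plus the bias term.
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


# Two features (n = 2): 1.0 + 2.0 * 3.0 + 0.5 * 4.0
y_hat = predict([1.0, 2.0, 0.5], [3.0, 4.0])
print(y_hat)  # 9.0
```

Fitting a Linear Regression model then means choosing the values in `theta`, which is where a cost function comes in.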

Written by azar_e