Introducing Gradient Descent: minimizing the cost function

Coursera ML-W1 (Parameter learning)

Gradient descent is a basic optimization algorithm, widely used to minimize cost functions in machine learning. Here I explain what it does and why we need it. My explanation is based on what I learned in the Coursera Machine Learning course.

Why we need this

Machine learning uses many different algorithms. Here I take supervised learning as the example: suppose we have a set of labeled data, that is, (input, output) pairs. We want to fit this data for either regression (predicting a continuous output value) or classification (assigning each input to one of several classes). For example, predicting the price of a car from its features is regression, while predicting the model of a car from its image is classification. We first choose a model representation (the hypothesis function), and from it we derive a cost function that measures how far the model’s predictions are from the labeled outputs. We then need the parameter values that minimize the cost function J(theta0, theta1), and to find them we use the gradient descent algorithm.
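
To make the cost function a little more concrete, here is a minimal sketch in Python. It assumes a straight-line model h(x) = theta0 + theta1 * x and the squared-error cost from the course; the function names and the toy data are my own, made up only for illustration.

```python
def hypothesis(theta0, theta1, x):
    """Linear model: predicted output for input x (assumed model form)."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data that lies exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(cost(1.0, 2.0, xs, ys))  # parameters that fit the data perfectly
print(cost(0.0, 0.0, xs, ys))  # parameters that fit the data poorly
```

The better the parameters fit the data, the smaller the cost, which is why we look for the parameters that minimize it.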

How it works

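Given the cost function, we first calculate its derivative (slope) with respect to each parameter. We then move the parameters a small step in the direction that decreases the cost, with the step size scaled by the learning rate, and repeat until the cost stops decreasing and we reach an optimal (minimum) value.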

theta0 and theta1 are the two parameters, which are updated at every step using the learning rate alpha (ML-W1)
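
For reference, the update rule that this caption describes can be written as follows. This is the standard form taught in the course, with alpha as the learning rate and both parameters updated simultaneously:

```latex
\text{repeat until convergence:} \quad
\theta_j := \theta_j - \alpha \, \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
\qquad \text{for } j = 0 \text{ and } j = 1
```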

Here are some guidelines for choosing the learning rate:

  1. Don’t choose a learning rate that is too small; it will make the model slow to converge.
  2. Don’t choose a learning rate that is too large; the steps become so big that the model can overshoot the optimal point and may fail to converge.
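
The two points above can be seen in a tiny experiment. This is only a sketch on a made-up one-parameter cost J(theta) = theta^2 (minimum at theta = 0), not the course’s example; the function name `run` and the alpha values are my own choices for illustration.

```python
def run(alpha, steps=20, theta=1.0):
    """Gradient descent on the toy cost J(theta) = theta**2, derivative 2*theta."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta


small = run(alpha=0.001)  # too small: barely moves away from the start at 1.0
good = run(alpha=0.1)     # reasonable: ends up close to the minimum at 0
large = run(alpha=1.1)    # too large: overshoots every step and moves away from 0
print(small, good, large)
```

With the small rate theta is still near its starting value after 20 steps, while with the large rate each step jumps past the minimum and lands farther away than before.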

So gradient descent is an optimization algorithm that helps our learning algorithm find the best fit for the model. To summarize:

  • Start with some initial values of theta0 and theta1.
  • Keep changing them to reduce the cost function until we reach an optimal (minimum) value.
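
As a small preview, the two steps above could be sketched like this for a straight-line model h(x) = theta0 + theta1 * x with the squared-error cost. The function name, the toy data, the learning rate and the number of steps are all assumptions I picked for illustration, not something fixed by the course.

```python
def gradient_descent(xs, ys, alpha=0.1, steps=2000):
    """Fit theta0, theta1 of h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # step 1: start with some values
    for _ in range(steps):     # step 2: keep changing them to reduce the cost
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                              # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m   # dJ/dtheta1
        # Simultaneous update: both gradients use the old parameter values
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data that lies exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)  # should end up close to 1 and 2
```

A fixed number of steps keeps the sketch short; a real implementation would more likely stop once the cost changes by less than some small threshold.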

I didn’t include much mathematics above, so in the next story we’ll implement gradient descent.

