Vectorized Implementation of Gradient Descent in Linear Regression

Vishesh Khandelwal · Published in Analytics Vidhya · 3 min read · May 7, 2020

Gradient Descent Algorithm:

The gradient descent algorithm finds the optimal value of θ such that the cost function J(θ) is minimized.
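For linear regression with learning rate α, the update rule takes the standard form:

\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

applied simultaneously for every j.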

Assume that the following values of X, y and θ are given:

m = number of training examples
n = number of features + 1

m = 5 (total number of training examples)
n = 4 (number of features + 1)
X = m x n matrix
y = m x 1 vector
θ = n x 1 vector
xᶦ is the ith training example
xⱼ is the jth feature in a given training example
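As a minimal NumPy sketch of this setup (the random values here are illustrative placeholders, not data from the article):

import numpy as np

m, n = 5, 4  # 5 training examples; 3 features + 1 bias column

# Placeholder data: the first column of X is all ones (the intercept term)
X = np.hstack([np.ones((m, 1)), np.random.rand(m, n - 1)])  # m x n matrix
y = np.random.rand(m, 1)                                     # m x 1 vector
theta = np.zeros((n, 1))                                     # n x 1 vector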

Finding Errors using matrices:

h(x) = [X] * [θ] (an m x 1 vector of predicted values for our training set)
h(x) - y = [X] * [θ] - [y] (an m x 1 vector of errors in our predictions)
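Continuing the NumPy sketch above, both quantities are single vectorized expressions:

h = X @ theta   # m x 1 vector of predictions
E = h - y       # m x 1 vector of errors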

The whole objective of machine learning is to minimize the errors in our predictions.

The error matrix E is an m x 1 column vector, as follows:
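E = X\theta - y =
\begin{bmatrix} e^{(1)} \\ e^{(2)} \\ \vdots \\ e^{(m)} \end{bmatrix},
\qquad e^{(i)} = h_\theta(x^{(i)}) - y^{(i)}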

To calculate the new value of θⱼ, we take the summation of all the errors (m rows), each multiplied by the jth feature value of the corresponding training example in X.

That is, we take each value in E, multiply it by the jth feature of the corresponding training example, and add them all together, exactly as the gradient descent update rule requires:
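\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} e^{(i)} \, x_j^{(i)}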

We can further simplify this to:
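\theta_j := \theta_j - \frac{\alpha}{m} \left( E' X \right)_j,
\qquad \text{since} \quad \sum_{i=1}^{m} e^{(i)} x_j^{(i)} = \left( E' X \right)_j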

[E]’ x [X] gives us a 1 x n row vector, since E’ is a 1 x m matrix and X is an m x n matrix.

But we are interested in a column vector of updates (one per parameter), so we transpose the resulting matrix:
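\left( E' X \right)' = X' E \qquad (n \times 1 \text{ column vector})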

Final Vectorized Equation:
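Substituting E = Xθ - y, the complete update for all n parameters in a single step is:

\theta := \theta - \frac{\alpha}{m} \, X' \left( X\theta - y \right)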

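A minimal end-to-end NumPy sketch of this final equation (the learning rate, iteration count, and data below are illustrative assumptions, not values from the article):

import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    # X: m x n design matrix (first column assumed to be all ones)
    # y: m x 1 target vector; returns the learned n x 1 theta
    m, n = X.shape
    theta = np.zeros((n, 1))
    for _ in range(num_iters):
        E = X @ theta - y                        # m x 1 error vector
        theta = theta - (alpha / m) * (X.T @ E)  # final vectorized update
    return theta

# Illustrative usage with placeholder data (m = 5, n = 4)
rng = np.random.default_rng(0)
X = np.hstack([np.ones((5, 1)), rng.random((5, 3))])
y = rng.random((5, 1))
theta = gradient_descent(X, y)
print(theta.ravel())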
