Vectorized Implementation of Gradient Descent in Linear Regression

Vishesh Khandelwal · Published in Analytics Vidhya · 3 min read · May 7, 2020

Gradient Descent Algorithm:

The gradient descent algorithm finds the optimal value of θ such that the cost function J(θ) is minimized.
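For linear regression with learning rate α, the update rule takes the standard form:

\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

applied simultaneously for every j.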

Assume that the following values of X, y and θ are given:

m = number of training examples
n = number of features + 1

m = 5 (total number of training examples)
n = 4 (number of features + 1)
X = m x n matrix
y = m x 1 vector
θ = n x 1 vector
xᶦ is the ith training example
xⱼ is the jth feature in a given training example
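As a minimal NumPy sketch of this setup (the random values here are illustrative placeholders, not data from the article):

import numpy as np

m, n = 5, 4  # 5 training examples; 3 features + 1 bias column

# Placeholder data: the first column of X is all ones (the intercept term)
X = np.hstack([np.ones((m, 1)), np.random.rand(m, n - 1)])  # m x n matrix
y = np.random.rand(m, 1)                                     # m x 1 vector
theta = np.zeros((n, 1))                                     # n x 1 vector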

Finding Errors using matrices:

h(x) = [X] * [θ] (an m x 1 vector of predicted values for our training set)
h(x) - y = [X] * [θ] - [y] (an m x 1 vector of errors in our predictions)
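Continuing the NumPy sketch above, both quantities are single vectorized expressions:

h = X @ theta   # m x 1 vector of predictions
E = h - y       # m x 1 vector of errors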

The whole objective of machine learning is to minimize the errors in our predictions.

The error matrix E is an m x 1 column vector, as follows:
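E = X\theta - y =
\begin{bmatrix} e^{(1)} \\ e^{(2)} \\ \vdots \\ e^{(m)} \end{bmatrix},
\qquad e^{(i)} = h_\theta(x^{(i)}) - y^{(i)}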

To calculate the new value of θⱼ, we take the summation of all the errors (m rows), each multiplied by the jth feature value of the corresponding training example in X.

That is, we take each value in E, multiply it by the jth feature of the corresponding training example, and add them all together, exactly as the gradient descent update rule requires:
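\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} e^{(i)} \, x_j^{(i)}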

We can further simplify this to:
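\theta_j := \theta_j - \frac{\alpha}{m} \left( E' X \right)_j,
\qquad \text{since} \quad \sum_{i=1}^{m} e^{(i)} x_j^{(i)} = \left( E' X \right)_j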

[E]’ x [X] gives us a 1 x n row vector, since E’ is a 1 x m matrix and X is an m x n matrix.

But we are interested in a column vector of updates (one per parameter), so we transpose the resulting matrix:
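\left( E' X \right)' = X' E \qquad (n \times 1 \text{ column vector})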

Final Vectorized Equation:
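Substituting E = Xθ - y, the complete update for all n parameters in a single step is:

\theta := \theta - \frac{\alpha}{m} \, X' \left( X\theta - y \right)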

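A minimal end-to-end NumPy sketch of this final equation (the learning rate, iteration count, and data below are illustrative assumptions, not values from the article):

import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    # X: m x n design matrix (first column assumed to be all ones)
    # y: m x 1 target vector; returns the learned n x 1 theta
    m, n = X.shape
    theta = np.zeros((n, 1))
    for _ in range(num_iters):
        E = X @ theta - y                        # m x 1 error vector
        theta = theta - (alpha / m) * (X.T @ E)  # final vectorized update
    return theta

# Illustrative usage with placeholder data (m = 5, n = 4)
rng = np.random.default_rng(0)
X = np.hstack([np.ones((5, 1)), rng.random((5, 3))])
y = rng.random((5, 1))
theta = gradient_descent(X, y)
print(theta.ravel())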
