Machine Learning 101: Linear Regression from Scratch

Dhruv Shrinet
3 min read · Jun 8, 2020


Implementing linear regression with sklearn is pretty damn easy, it's just two lines of code, but have you ever wondered how that really works?

By that I mean: how do the two theta values change, and how does that show up in gradient descent and the loss function? We will see exactly that in this story. I would recommend you catch up on the basics of linear regression first so that nothing bounces off!
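For reference, this is roughly what those two sklearn lines look like. The tiny synthetic X and y here are just an assumption for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny synthetic dataset, purely an assumption for illustration
X = np.arange(10).reshape(-1, 1)             # shape (10, 1)
y = 3 * X.ravel() + 5 + np.random.randn(10)

model = LinearRegression().fit(X, y)         # line 1: fit the model
pred = model.predict(X)                      # line 2: get the predictions
```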

The hypothesis is Y = M·X + B, where Y is the predicted value, X is the input, and M and B are the two theta parameters (the slope and the intercept).

Let’s see how it can be written:

Theta is kept as a small list (or array), and the result is stored in pred:
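A minimal sketch of that hypothesis function, assuming theta is a two-element array with theta[0] as the intercept B and theta[1] as the slope M (the names are my assumption):

```python
import numpy as np

def hypothesis(X, theta):
    """Predicted value: y = theta[1] * x + theta[0] for every x in X."""
    pred = theta[1] * X + theta[0]
    return pred
```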

What about the loss function? Here we use Mean Squared Error:

The formula: MSE = (1/N) · Σᵢ (yᵢ − ŷᵢ)², the average squared difference between the true values and the predictions.

We loop over all our data and calculate the loss for the current thetas coming from our main gradient descent function (don't worry, it's further down).
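One possible sketch of that loss computation, written as a plain loop to mirror the description (vectorising it with numpy would work just as well):

```python
def mse_loss(X, Y, theta):
    """Mean Squared Error of the current theta over the whole dataset."""
    n = len(X)
    total = 0.0
    for i in range(n):
        y_pred = theta[1] * X[i] + theta[0]   # same hypothesis as above
        total += (Y[i] - y_pred) ** 2
    return total / n
```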

What about the two thetas? How will they change? We have to compute the gradient, which here means taking the derivatives of the cost function with respect to theta0 and theta1. For MSE these work out to (2/N) · Σ (ŷᵢ − yᵢ) for theta0 and (2/N) · Σ (ŷᵢ − yᵢ) · xᵢ for theta1 (the constant factor can be folded into the learning rate).

We make a "grad" numpy array to hold the two partial derivatives, update them in a loop over the data, and then return their mean.
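Here is one way that gradient function could look, following the description above; the "grad" array holds the two partial derivatives and we return their mean over the data:

```python
def gradient(X, Y, theta):
    """Partial derivatives of the MSE loss w.r.t. theta[0] (intercept) and theta[1] (slope)."""
    grad = np.zeros(2)
    n = len(X)
    for i in range(n):
        y_pred = theta[1] * X[i] + theta[0]
        grad[0] += (y_pred - Y[i])            # d(loss)/d(theta0)
        grad[1] += (y_pred - Y[i]) * X[i]     # d(loss)/d(theta1)
    return grad / n                           # mean over all examples; the factor of 2 is absorbed into the learning rate
```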

Let's check out the main function, which finds both thetas based on the learning rate and also stores the loss for us.
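A sketch of such a main function, assuming the hypothesis, loss and gradient helpers above (the learning rate and number of iterations are arbitrary choices here):

```python
def gradient_descent(X, Y, learning_rate=0.1, max_iters=300):
    """Batch gradient descent that also records the loss and thetas at every step."""
    theta = np.zeros(2)          # [theta0 (intercept), theta1 (slope)]
    error_list = []
    theta_list = []
    for _ in range(max_iters):
        error_list.append(mse_loss(X, Y, theta))
        theta_list.append(theta.copy())
        theta = theta - learning_rate * gradient(X, Y, theta)   # the update rule
    return theta, error_list, theta_list
```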

In return we get the error list and the theta list. Let's see what we can do with them.
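Putting it together might look something like this; the synthetic data is purely an assumption for this sketch, so the exact loss numbers will differ from the original run:

```python
import matplotlib.pyplot as plt

# Synthetic data, an assumption for illustration only
X = np.random.randn(400)
Y = 29 * X + 10 + 4 * np.random.randn(400)

theta, error_list, theta_list = gradient_descent(X, Y)

plt.plot(error_list)            # the loss curve: watch for the dip
plt.xlabel("iteration")
plt.ylabel("MSE loss")
plt.show()
```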

Looks like our small program is working just fine. Watch that dip! We got down to 190!

Looks like we found ourselves the global minimum, which makes sense: the MSE loss for linear regression is convex, so there is only one minimum to find.

Accuracy!!

Let's plot the predictions against our original data and see how good our results are!

We scatter-plot the original data together with our new line, which comes from the predictions of our trained model, and calculate the accuracy with r2_score.
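Something along these lines, assuming the trained theta and the data from the sketch above:

```python
from sklearn.metrics import r2_score

Y_pred = hypothesis(X, theta)

plt.scatter(X, Y, label="training data")                  # the original points
plt.plot(X, Y_pred, color="orange", label="prediction")   # our fitted line
plt.legend()
plt.show()

print("R2 score:", r2_score(Y, Y_pred))
```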

Do check out my next post on Multivariate regression
