Implementing Linear Regression From Scratch using Gradient Descent

Linear Regression in Python

Harshwardhan Jadhav · Published in Everyday Analytics · Jan 17, 2021


(Image source: https://www.warriortrading.com/linear-regression-definition-day-trading-terminology/)

Hello everyone! This article is going to be a little mathematical and code-oriented. It is an extension of my previous article, in which I discussed the basic theoretical understanding of Linear Regression, and I recommend reading that first to get the most out of this one. CLICK HERE to read the basics of linear regression. Also, read about Gradient Descent HERE, because we are going to use it in this article.

Okay, now let’s start.

In the last article, we saw what linear regression is, what terms are used, and a small example for a practical understanding of it. In this article, we are going to see how we can implement our own linear regression model and use it for prediction.

Without wasting time, let’s jump right into simple linear regression first.

Firstly, let’s define a loss function to measure the performance of the model. We will use the least squares loss, which is the standard choice: very basic and also simple to understand.

Linear Regression Equation: y = mx + c
Loss Function: E = (1/n) Σ (yi − pi)²
Equation of slope: m = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²
Equation of intercept: c = ȳ − m·x̄

Above, I have mentioned both our regression formula and the loss function. We saw much more about the regression equation in the last blog; the new thing here is the loss function. The least squares loss is just the sum (here averaged) of the squared differences between the true values (yi) and the predicted values (pi). A very common question must have come to your mind: why are we squaring the difference? The answer is simple: if we used (yi − pi) directly, we would get negative values for some data points, and positive and negative errors would cancel each other out. For example, errors of −2 and +2 sum to zero, while their squares sum to 8. Squaring removes the sign (and, as a bonus, gives a smooth function that is easy to differentiate, which we will need for gradient descent). There is no deeper secret than that.

Okay, let’s look at the code and see how we implement it in Python.

import numpy as np

# Building the model
# (X and Y are assumed to be 1-D NumPy arrays of the data points)
X_mean = np.mean(X) # mean of all X values in the dataset
Y_mean = np.mean(Y) # mean of all Y values in the dataset
# Initialize the numerator and denominator
# for calculating the value of the slope
num = 0
den = 0
# Loop over all the data points and accumulate the sums
for i in range(len(X)):
    num += (X[i] - X_mean) * (Y[i] - Y_mean)
    den += (X[i] - X_mean) ** 2
# Calculate the slope
m = num / den
# Calculate the intercept
c = Y_mean - m * X_mean

In the above code, we found the values of ‘m’ and ‘c’, which we will use to predict values via the equation ‘mx+c’.
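To sanity-check these formulas, here is a small self-contained sketch on made-up numbers (the X and Y values below are hypothetical examples, not data from this article):

import numpy as np

# Hypothetical data, generated to follow roughly y = 2x + 1
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.0, 5.1, 6.9, 9.2, 10.8])

X_mean = np.mean(X)
Y_mean = np.mean(Y)
num = np.sum((X - X_mean) * (Y - Y_mean))  # vectorized version of the loop above
den = np.sum((X - X_mean) ** 2)
m = num / den
c = Y_mean - m * X_mean
print(m, c)  # prints values close to 2 and 1 (about 1.97 and 1.09)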

Now, let’s see how we use those values for predictions,

# Loop over all the data points and predict the value of y for each
predictions = []
for i in range(len(X)):
    Y_pred = m * X[i] + c  # prediction stage: mx + c
    predictions.append(Y_pred)
# At the end of this loop the list 'predictions' holds
# all the predicted values of y
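By the way, if X is a NumPy array, the whole prediction loop collapses into a single vectorized line:

predictions = m * X + c  # same result as the loop, computed element-wise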

A simple and easy-to-grasp concept, right?

But you must have noticed that we did not use the loss function anywhere in this code. That was a very basic implementation; we will bring the loss function in right now. Yes, we will now implement a slightly more advanced linear regression model, whose name I have included in the title of this blog: we are going to use Gradient Descent.

The equations are the same as above, but we are going to use them differently here. As we know, we need to find the slope ‘m’ and the intercept term ‘c’ of the equation “y=mx+c”. In gradient descent, we find them by iteratively calculating the derivative of the loss function, and our objective is to reach the minimum loss.

We find the derivative of the loss function w.r.t. the parameters. In our equation ‘mx+c’ we have two parameters, ‘m’ and ‘c’, so we have to find the two derivatives ‘D_m’ and ‘D_c’. Let’s see how.

Loss Function: E = (1/n) Σ (yi − pi)²

Here, pi is the predicted value, i.e. m·xi + c.

Derivative with respect to m: D_m = ∂E/∂m = (−2/n) Σ xi (yi − pi)

Derivative with respect to c: D_c = ∂E/∂c = (−2/n) Σ (yi − pi)
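With the gradients in hand, each iteration of gradient descent moves the parameters a small step in the opposite direction of the gradient, scaled by the learning rate LR; these are exactly the update rules the code below applies:

m = m - LR * D_m
c = c - LR * D_c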

If you don’t know how to calculate derivatives, don’t worry; there is always a way in this world of AI. Here is one very nice website where you can calculate the derivative of any function you desire: derivative-calculator.net. Everyone is scared of derivatives, but they are not that scary if we use this calculator.

Now we have all the prerequisites for implementing the code, let’s dive in,

import numpy as np

# Initialize the parameters
m = 0
c = 0
LR = 0.0001        # the learning rate
epochs = 100       # the number of iterations
n = float(len(X))  # total number of elements in X

# Performing Gradient Descent optimization
for epoch in range(epochs):
    sum1 = 0
    sum2 = 0
    for i in range(len(X)):
        Y_pred = m * X[i] + c  # predict the value of Y
        sum1 += X[i] * (Y[i] - Y_pred)
        sum2 += Y[i] - Y_pred
    D_m = (-2 / n) * sum1  # gradient w.r.t. m
    D_c = (-2 / n) * sum2  # gradient w.r.t. c
    m = m - LR * D_m  # update m
    c = c - LR * D_c  # update c
# At the end we have the optimum values of m and c

# Using those values, predict the values of y with mx + c
predictions = []
for i in range(len(X)):
    Y_pred = m * X[i] + c  # predict the value of Y
    predictions.append(Y_pred)  # append predicted value to the list
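As a quick sanity check, here is a minimal self-contained sketch that runs this loop on the same hypothetical data as before; the data and the learning rate here are my own assumptions, chosen so that 1,000 iterations converge:

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical data from earlier
Y = np.array([3.0, 5.1, 6.9, 9.2, 10.8])

m, c = 0.0, 0.0
LR = 0.05      # learning rate tuned for this small, well-scaled dataset
epochs = 1000
n = float(len(X))

for epoch in range(epochs):
    Y_pred = m * X + c                         # vectorized predictions
    D_m = (-2 / n) * np.sum(X * (Y - Y_pred))  # gradient w.r.t. m
    D_c = (-2 / n) * np.sum(Y - Y_pred)        # gradient w.r.t. c
    m -= LR * D_m
    c -= LR * D_c

print(m, c)  # approaches the closed-form solution (about 1.97 and 1.09)

If the loss grows instead of shrinking, the usual culprit is a learning rate that is too large for the scale of the data.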

Done! We have successfully implemented our own Linear Regression model from scratch. But we don’t have to write this code every time we want to use it; people have developed libraries for us, and one of them is scikit-learn, the most popular library for machine learning in general. Let’s see the sklearn variant of linear regression.

# install the library
pip install -U scikit-learn

# after installing, import it
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
>>> # y = 1 * x_0 + 2 * x_1 + 3
>>> y = np.dot(X, np.array([1, 2])) + 3
>>> reg = LinearRegression().fit(X, y)
>>> reg.score(X, y)
1.0
>>> reg.coef_
array([1., 2.])
>>> reg.intercept_
3.0000...
>>> reg.predict(np.array([[3, 5]]))
array([16.])
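One practical note: LinearRegression expects a 2-D feature matrix, so the single-feature X from our from-scratch examples needs a reshape. A minimal sketch, reusing the hypothetical data from above:

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0]).reshape(-1, 1)  # one column per feature
Y = np.array([3.0, 5.1, 6.9, 9.2, 10.8])
reg = LinearRegression().fit(X, Y)
print(reg.coef_[0], reg.intercept_)  # matches our from-scratch m and c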

You can see that sklearn saves us many lines of code, and thus the effort of writing it from scratch every time.

So that’s all. I hope you enjoyed this article; if yes, then give it a clap and share it with your friends who are learning machine learning. In the next article, we will solve a real-world regression problem. See you soon. Thank you.

Connect with me on LinkedIn:

Reference:

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
