Simple Linear Regression Implementation From Scratch

Part 4/5 in Linear Regression

Pratik Shukla
5 min read · May 15, 2020

Part 1 : Linear Regression From Scratch.

Part 2 : Linear Regression Line Through Brute Force.

Part 3 : Linear Regression Complete Derivation.

Part 4 : Simple Linear Regression Implementation From Scratch.

Part 5 : Simple Linear Regression Implementation Using Scikit-Learn.

In the last article we derived a formula to calculate the “best fit” regression line. Now it’s time to implement it in Python. Keep in mind that in this article we are not going to use libraries such as scikit-learn to find the slope and intercept of the line. Instead, we will implement simple linear regression with basic Python code and user-defined functions, using the formula we derived in the last article. We’ll use matplotlib to visualize the data and the regression line, but if you only want to find the regression line you can skip the plotting code.

Credits : Unsplash

Goal : To predict the CO2-emission of a new car.

Let’s Code 🌄

(1) Importing the required libraries :

Import Libraries

(2) Read the csv file :

There are more columns in our data but due to limited space I can only show a few here.

Read CSV File
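Steps (1) and (2) can be sketched roughly as below. The article does not name the CSV file or its columns, so the column names used here (ENGINESIZE, CYLINDERS, FUELCONSUMPTION_COMB, CO2EMISSIONS) and the few rows of data are assumptions made only so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file. The real file name, columns and
# values are not given in the article; these rows are toy data.
csv_text = """ENGINESIZE,CYLINDERS,FUELCONSUMPTION_COMB,CO2EMISSIONS
2.0,4,8.5,196
2.4,4,9.6,221
1.5,4,5.9,136
3.5,6,11.1,255
3.5,6,10.6,244
"""

# With a real file on disk this would be pd.read_csv("path/to/your_file.csv").
df = pd.read_csv(io.StringIO(csv_text))

# Show the first rows to confirm the data was parsed as expected.
print(df.head())
```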

(3) Find out the columns in our data :

Columns

(4) Find additional information about our data :

Dataset Info

(5) Print various statistical data of our dataset :

Analysis of Data
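Steps (3) to (5) most likely map onto three pandas calls. Here is a minimal sketch on a toy DataFrame; the column names and values are assumptions, not the article's data.

```python
import pandas as pd

# Toy data with assumed column names.
df = pd.DataFrame({
    "ENGINESIZE": [2.0, 2.4, 1.5, 3.5, 3.5],
    "CO2EMISSIONS": [196, 221, 136, 255, 244],
})

# (3) Column labels.
print(df.columns)

# (4) Data types and non-null counts (info() prints directly).
df.info()

# (5) Count, mean, std, min, quartiles and max for each numeric column.
summary = df.describe()
print(summary)
```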

(6) Select useful features from our dataset :

Feature Selection
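Feature selection here simply means keeping the columns we care about. A small sketch, assuming the same hypothetical column names plus an extra MAKE column that we drop:

```python
import pandas as pd

# Toy data; column names and values are assumptions.
df = pd.DataFrame({
    "MAKE": ["A", "B", "C"],
    "ENGINESIZE": [2.0, 2.4, 1.5],
    "CYLINDERS": [4, 4, 4],
    "FUELCONSUMPTION_COMB": [8.5, 9.6, 5.9],
    "CO2EMISSIONS": [196, 221, 136],
})

# Keep only the numeric columns that might help predict CO2 emission.
selected = ["ENGINESIZE", "CYLINDERS", "FUELCONSUMPTION_COMB", "CO2EMISSIONS"]
features = df[selected]
print(features.head())
```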

(7) Plot a histogram of the data to see its value counts :

Histogram
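A histogram like this can be drawn with matplotlib. The sketch below uses plain lists of made-up values and assumed feature names; the Agg backend is chosen only so the snippet also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Toy values, not taken from the article's dataset.
engine_size = [1.5, 2.0, 2.0, 2.4, 3.5, 3.5, 3.7, 4.7]
co2 = [136, 196, 200, 221, 244, 255, 267, 300]

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# hist() returns the count per bin, the bin edges, and the drawn bars.
counts, bin_edges, _ = axes[0].hist(engine_size, bins=4)
axes[0].set_title("ENGINESIZE")

axes[1].hist(co2, bins=4)
axes[1].set_title("CO2EMISSIONS")

fig.tight_layout()
# plt.show() would display the figure in an interactive session.
```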

(8) Plot the data on scatter plots to find out which feature can be used to make the predictions :

Plotted Data
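One of these scatter plots could be produced as follows. This is only a sketch with made-up values and assumed feature names; repeat it per feature to compare them.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Toy values, not taken from the article's dataset.
engine_size = [1.5, 2.0, 2.4, 3.5, 3.7, 4.7]
co2 = [136, 196, 221, 255, 267, 300]

fig, ax = plt.subplots()

# One point per car: feature on the x axis, target on the y axis.
points = ax.scatter(engine_size, co2, color="blue")
ax.set_xlabel("ENGINESIZE")
ax.set_ylabel("CO2EMISSIONS")
```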

Here we can see that the points in the ENGINE SIZE vs CO2 EMISSION plot follow a roughly straight trend, so we can easily fit a regression line to it.

(9) Now we will divide our dataset into two parts: one for training and one for testing. We’ll use 80% of the data for training and the remaining 20% to test our predictions.

Divide Data
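One possible way to make the 80/20 split without any library is to shuffle the rows and cut the list. The helper name `train_test_split` and the fixed seed are my own choices for this sketch, and the rows are toy (engine size, CO2) pairs.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and cut it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy (engine size, CO2 emission) pairs.
rows = [(1.5, 136), (2.0, 196), (2.4, 221), (3.5, 255), (3.7, 267),
        (4.7, 300), (1.8, 170), (2.5, 230), (3.0, 240), (5.0, 320)]

train, test = train_test_split(rows)
print(len(train), len(test))
```

Shuffling before cutting matters when the file is sorted (for example by engine size); otherwise the test set would only contain one end of the range.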

(10) Finding the mean of CO2-EMISSION :

Average
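The mean is just the sum divided by the count. A tiny sketch with toy CO2 values:

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


# Toy training targets, not the article's data.
co2_train = [196, 221, 136, 255, 244]
co2_mean = mean(co2_train)
print(co2_mean)
```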

(11) Main function to find the slope and intercept. Check out my last article to understand the derivation of the formula used here.

Main Function
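Here is a sketch of such a function. I am assuming the derived formula is the standard least-squares closed form: slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and intercept = ȳ − slope · x̄. The function name `fit_line` is hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    x_mean, y_mean = mean(x), mean(y)
    # Sum of products of deviations from the means.
    numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Sum of squared deviations of x from its mean.
    denominator = sum((xi - x_mean) ** 2 for xi in x)
    slope = numerator / denominator
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Sanity check on points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(slope, intercept)  # prints: 2.0 1.0
```

Note that the denominator is zero when all x values are identical, so the function assumes at least two distinct x values.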

(12) Testing our function with basic data :

Testing our Function

Voila! It works perfectly!!

(13) Finding the Slope and Intercept for our actual data :

Finding parameters

(14) Now that we have our Slope and Intercept with us we can make our regression line :

Regression Line

(15) Plot the regression line to visualize it :

Data Visualization
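Steps (14) and (15) can be sketched together: compute the line's y value for every x, then draw it over the scatter plot. The slope and intercept below are illustrative numbers, not the values obtained in this article, and the data points are made up.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Toy data and illustrative parameters (assumptions for this sketch).
engine_size = [1.5, 2.0, 2.4, 3.5, 3.7, 4.7]
co2 = [136, 196, 221, 255, 267, 300]
slope, intercept = 39.0, 125.0

# (14) Regression line: y = slope * x + intercept for every x.
line_y = [slope * x + intercept for x in engine_size]

# (15) Scatter the data and draw the line on the same axes.
fig, ax = plt.subplots()
ax.scatter(engine_size, co2, color="blue", label="data")
ax.plot(engine_size, line_y, color="red", label="regression line")
ax.set_xlabel("ENGINESIZE")
ax.set_ylabel("CO2EMISSIONS")
ax.legend()
```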

(16) Now we’ll predict the values with our model. But first we need to make a function for that :

Prediction
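A prediction function only has to evaluate the line at a new x. In this sketch the name `predict` and the parameter values are illustrative assumptions.

```python
def predict(engine_size, slope, intercept):
    """Predicted CO2 emission for a given engine size."""
    return slope * engine_size + intercept


# Illustrative parameters, not the values fitted in the article.
slope, intercept = 39.0, 125.0
predicted_co2 = predict(3.5, slope, intercept)
print(predicted_co2)  # prints: 261.5
```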

(17) Can we predict the engine-size from co2-emission? Of course!! Here’s how to do it.

Reverse Prediction
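The simplest way to go backwards is to solve y = slope · x + intercept for x. This sketch assumes that is what is meant here; strictly speaking, it inverts the fitted line rather than fitting a new regression of engine size on CO2 emission, and the two generally give different answers on noisy data.

```python
def predict_engine_size(co2, slope, intercept):
    """Solve co2 = slope * x + intercept for x."""
    if slope == 0:
        # A flat line carries no information about x.
        raise ValueError("slope is zero; the line cannot be inverted")
    return (co2 - intercept) / slope


# Illustrative parameters, not the values fitted in the article.
slope, intercept = 39.0, 125.0
engine = predict_engine_size(261.5, slope, intercept)
print(engine)  # prints: 3.5
```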

Now it’s time to check how well our model performed in predicting the testing values. There are many ways to measure the error or accuracy of a model. Here we’ll cover a few of them.

(1) Residual_Sum_of_Squares :

RSS Accuracy
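RSS adds up the squared differences between actual and predicted values, so lower is better. A minimal sketch on toy numbers:

```python
def residual_sum_of_squares(actual, predicted):
    """Sum of squared residuals (actual - predicted)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Toy values, not the article's test set.
actual = [2, 4, 5, 10]
predicted = [1, 5, 5, 8]
rss = residual_sum_of_squares(actual, predicted)
print(rss)  # prints: 6
```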

(2) R-Squared :

R² Accuracy
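R² compares the model's squared error with the squared error of always predicting the mean: R² = 1 − RSS / TSS. A sketch on toy numbers:

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - RSS / TSS."""
    mean_actual = sum(actual) / len(actual)
    # Error of the model.
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Error of a baseline that always predicts the mean.
    tss = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - rss / tss


# Toy values, not the article's test set.
actual = [2, 4, 5, 10]
predicted = [1, 5, 5, 8]
r2 = r_squared(actual, predicted)
print(round(r2, 4))
```

A value of 1 means a perfect fit; on held-out data the value can even be negative when the model does worse than the mean baseline.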

(3) Mean_Absolute_Error (MAE):

MAE Accuracy
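MAE is the average size of the errors, ignoring their sign. A sketch on toy numbers:

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy values, not the article's test set.
actual = [2, 4, 5, 10]
predicted = [1, 5, 5, 8]
mae = mean_absolute_error(actual, predicted)
print(mae)  # prints: 1.0
```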

(4) Mean_Squared_Error(MSE):

MSE Accuracy
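MSE is RSS divided by the number of samples, which makes it comparable across test sets of different sizes. A sketch on toy numbers:

```python
def mean_squared_error(actual, predicted):
    """Average of (actual - predicted) squared."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy values, not the article's test set.
actual = [2, 4, 5, 10]
predicted = [1, 5, 5, 8]
mse = mean_squared_error(actual, predicted)
print(mse)  # prints: 1.5
```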

(5) Mean_Absolute_Percentage_Error(MAPE):

MAPE Accuracy
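MAPE expresses each absolute error as a percentage of the actual value and averages them. The sketch below assumes no actual value is zero, since the division would fail.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, in percent."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Toy values, not the article's test set (none of the actuals is zero).
actual = [2, 4, 5, 10]
predicted = [1, 5, 5, 8]
mape = mean_absolute_percentage_error(actual, predicted)
print(round(mape, 2))  # prints: 23.75
```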

(6) Mean_Percentage_Error (MPE) :

MPE Accuracy
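MPE is like MAPE but keeps the sign of each error, so it shows whether the model tends to predict too low or too high. Sign conventions vary between sources; this sketch assumes (actual − predicted) / actual, so a positive result means the predictions are too low on average.

```python
def mean_percentage_error(actual, predicted):
    """Average of (actual - predicted) / actual, in percent."""
    total = sum((a - p) / a for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Toy values, not the article's test set (none of the actuals is zero).
actual = [2, 4, 5, 10]
predicted = [1, 5, 5, 8]
mpe = mean_percentage_error(actual, predicted)
print(round(mpe, 2))  # prints: 11.25
```

Because positive and negative errors cancel out, a small MPE does not by itself mean the predictions are accurate.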

In summary, in this article we saw how to implement simple linear regression without scikit-learn. It’s a lot of work, right? But wait..!! There is an easier way to perform the same calculations and get the same output using some Python libraries. In the next article we’ll see how we can perform such complex calculations in minutes with scikit-learn.

In my future articles I will try to show which accuracy metric works best for different kinds of datasets.

For more detailed explanations like this one, visit my blog.

You can follow me on Medium.

Watch video tutorials of machine learning algorithms here.

You can download the code and some handwritten notes on the derivation from here.

If you have any additional questions, feel free to contact me.

Thank You!
