Machine Learning Algorithms from Scratch (Part — I)

(Deep Dive into Ordinary Least Squares Method)

Yash Chauhan
Accredian
7 min read · Jul 29, 2022


Introduction

Mathematics and programming are the two main ingredients of data science, and every data practitioner needs to master both to excel in this highly competitive field. But learning the mathematics and practicing the coding involve more than meets the eye.

That is why we have started this series, Machine Learning Algorithms from Scratch. We will take you through the often confusing landscape of ML by breaking each algorithm down to its bare-minimum mathematical concepts and a NumPy-only implementation. Along the way, you will learn mathematics and practice programming that is both concise and directly relevant to data science.

What is Linear Regression?

Linear regression is the most straightforward machine learning algorithm: it models the relationship between an independent variable (X) and a dependent variable (Y) with a straight line.

Linear Regression Model:

ŷ = m·x + c

Because it is the simplest model, it is also the easiest to understand and implement. There are two common ways to fit a linear regression: the “Least Squares Method” and the “Gradient Descent Method.” We will look at the Least Squares Method in this article.

Starting with the Least Squares Method

Least Squares regression is a deterministic model: unlike stochastic models, the calculated weights do not depend on the algorithm’s state; they depend solely on the input data. Hence, no iterations are required, and we get the closed-form solution in one go.

The objective of Least Squares is to find the line that “best fits” the dataset: the line that, when laid over the input data points, has the lowest possible error.

Line of Best Fit on a Dataset

In the method of Ordinary Least Squares, we try to fit a straight line onto the data points by minimizing the squared difference between the predicted value and the observed value of a given dependent variable.

Finding the Error:

So, we have an independent variable (x) and a dependent variable (y), and the model aims to find a “line” that has the least possible error when compared to the data. But first, we need a way to measure the error so that we can work on minimizing it.

An error is simply the difference between the actual and predicted value. So, the error term should look like this:

The Error Term for a single Prediction:

eᵢ = yᵢ − ŷᵢ

But this is the error of a single data point; our dataset has multiple instances, and there is a separate error term for each one. We can therefore add up all the individual errors to get the error of the whole model.

Simple error for the whole model:

E = Σᵢ (yᵢ − ŷᵢ)

Now, look at this image of a linear regression line

Example of Linear Regression Line

Now you can see that not all the errors will be positive: some data points lie above the line and some below it, which means the model underestimated some predictions and overestimated others.

The data points above the regression line have positive error values, while the points below it have negative ones. Because of this, we cannot simply add up the error terms; positive and negative errors would cancel each other out (for example, errors of +2 and −2 sum to 0). We first need to make every term positive, which we do by squaring each error term before adding them up.

The Final Error Term

The final error (cost) for linear regression:

E = Σᵢ (yᵢ − ŷᵢ)²

Minimizing the Error

Now that the cost function (the error) has been identified, the only thing left to do is minimize it.

To do that, first, we take the final error term:

E = Σᵢ (yᵢ − ŷᵢ)²

Then we expand the predicted value, ŷᵢ = m·xᵢ + c:

E = Σᵢ (yᵢ − (m·xᵢ + c))²

Now we need to differentiate the error term with respect to the slope (m) and with respect to the intercept (c). Then, by setting each resulting derivative equal to zero, we can find expressions for both the slope and the intercept.
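To make that step concrete, here is a sketch of the calculus, using the same notation as above (with ŷᵢ = m·xᵢ + c):

∂E/∂c = −2 Σᵢ (yᵢ − m·xᵢ − c) = 0  ⟹  c = ȳ − m·x̄

∂E/∂m = −2 Σᵢ xᵢ (yᵢ − m·xᵢ − c) = 0

Substituting c = ȳ − m·x̄ into the second equation and rearranging yields the mean-centered expression for the slope shown below.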

The final output comes out as this:

The equation for the Slope of the Least Square model:

m = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)²

The equation for the Intercept of the Least Square model:

c = ȳ − m·x̄

The symbols (ȳ) and (x̄) represent the means of the Y and X features, respectively.

The value “m” is the slope and “c” is the intercept of the regression line. Once we have these two values, our regression model is complete.
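As a quick sanity check, the same two formulas can be written in vectorized NumPy and compared against np.polyfit, which solves the same least-squares problem (the data below is made up purely for illustration):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 5.2, 6.8, 9.1])

# m = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²  and  c = ȳ − m·x̄
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
c = y.mean() - m * x.mean()

# np.polyfit(x, y, 1) returns [slope, intercept] for a degree-1 fit
m_ref, c_ref = np.polyfit(x, y, 1)
print(m, c)          # closed-form result
print(m_ref, c_ref)  # should agree up to floating-point error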

Ordinary Least Squares Implementation in Python

We will create a Python class for Least Squares, similar to the ML models we import from the scikit-learn library, because it will also help us learn and practice Object-Oriented Programming (OOP), which is always a plus.

Generate Random Regression Data

First, let us generate random regression data to test our model. This can be done by using the scikit-learn library.
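For example, here is a minimal sketch using scikit-learn’s make_regression (the parameter values are illustrative, not necessarily the ones behind the outputs shown later):

from sklearn.datasets import make_regression

# One noisy, roughly linear feature; X has shape (100, 1), y has shape (100,)
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)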

Importing Required Libraries

As promised, we will only use NumPy to create our model from scratch.

import numpy as np

Initiating the Class

Now, let us start defining the class piece by piece. First, we will create a class named MyLeastSquares with one attribute each for the slope and the intercept, and we will initialize both to zero so that they start from a well-defined value.

class MyLeastSquares:
    def __init__(self):
        # Model parameters, initialized to zero
        self.intercept = 0
        self.slope = 0

Define a Method to Calculate the Slope

We have a direct formula to calculate the slope. All we have to do is implement it in Python as a method inside our class.

def CalculateSlope(self, X, y):
    numerator = 0
    denominator = 0

    # Calculate the means of X and y
    mean_X = np.mean(X)
    mean_y = np.mean(y)

    # Accumulate Σ(xᵢ − x̄)(yᵢ − ȳ) and Σ(xᵢ − x̄)²
    for i in range(len(X)):
        numerator += (X[i] - mean_X) * (y[i] - mean_y)
        denominator += (X[i] - mean_X) ** 2

    slope = numerator / denominator
    # X is a 2D column (as returned by make_regression), so the result
    # is a one-element array; take its single value before rounding
    slope = round(slope[0], 3)
    return slope

First, we initialize the numerator and denominator to zero. We then calculate the means of both X and y using NumPy’s mean() function. Next, we loop through each observation in X and accumulate the numerator and denominator of the slope equation. Finally, we divide the numerator by the denominator and round the slope to 3 decimal places.

Define a Method to Calculate the Intercept

Now, we will do the same thing with the formula for Intercept.

def CalculateIntercept(self, slope, X, y):
    # Calculate the means of X and y
    mean_X = np.mean(X)
    mean_y = np.mean(y)

    # c = ȳ − m·x̄
    intercept = mean_y - slope * mean_X
    intercept = round(intercept, 3)
    return intercept

Define the Fit Method

Similar to scikit-learn, we need to add a fit method that calls these functions and computes the slope and intercept.

def fit(self, X, y):
    self.slope = self.CalculateSlope(X, y)
    self.intercept = self.CalculateIntercept(self.slope, X, y)

This method is pretty straightforward. Just call the functions and store the values.

Define the Predict Method

Now, we need a “predict” method to calculate the predictions.

def predict(self, X):
    return self.slope * X + self.intercept

Putting Everything Together

How to use our Model
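Here is a minimal sketch of how the class can be used, assuming the X and y generated with make_regression above:

model = MyLeastSquares()
model.fit(X, y)

print("The value of calculated slope:", model.slope)
print("The value of calculated intercept:", model.intercept)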

Calculated Slope and Intercept

The value of calculated slope: 96.071
The value of calculated intercept: 7.381

Visualize the Calculated Least Square Line
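A short matplotlib sketch for the visualization (matplotlib is an extra dependency here, used only for plotting):

import matplotlib.pyplot as plt

# Scatter the raw data and overlay the fitted line
plt.scatter(X, y, label="Data")
plt.plot(X, model.predict(X), color="red", label="Least Squares line")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()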

Least Square Regression Line

Conclusion

  • That is it for this article. You have learned to create your own Least Squares model. Congratulations, you are now one algorithm closer to mastering Machine Learning.
  • Next, we will look at how to create Gradient Descent linear regression from scratch, so stay tuned if you are interested in demystifying ML once and for all.
  • Follow me for more upcoming articles related to Data Science, Machine Learning, and Artificial Intelligence.

Also, do give me a Clap👏 if you find this article useful; your encouragement helps inspire me to create more cool stuff like this.

Final Thoughts and Closing Comments

There are some vital points many people fail to understand while pursuing their Data Science or AI journey. If you are one of them and are looking for a way to fill these gaps, check out the certification programs provided by INSAID on their website. If you liked this story, I recommend the Global Certificate in Data Science & AI, as it covers the foundations, machine learning algorithms, and deep neural networks (basic to advanced).
