# Multiple Linear Regression from Scratch using Python

In the previous post, you learned how to implement simple linear regression from scratch using only NumPy. In today’s post, I will show how to implement multiple linear regression from scratch, again using only NumPy.

# Multiple Linear Regression

In simple linear regression, we want to predict the dependent variable ‘y’ using a single explanatory variable ‘x’, as in the equation below.

y = ax + b

‘y’ is the dependent variable, ‘x’ is the explanatory variable, ‘a’ is the slope of the line, and ‘b’ is the intercept, in other words, the value of ‘y’ when ‘x’ is zero. In a linear regression, we want to find the values of ‘a’ and ‘b’ that minimize the prediction errors.

Multiple Linear Regression is an extension of linear regression used when you have more than one explanatory variable to predict the dependent variable.

Where, for i = 1, …, n observations:

- Y is the dependent variable.
- The Xs are the explanatory variables.
- *ß*0 is the y-intercept (constant term).
- The other *ß*s are the slope coefficients, one for each explanatory variable.
- The *ß*s are also called weights.

Note that simple linear regression is just a special case of multiple regression in which all the *ß* terms from *ß*2 to *ß*p are zero.

As an example, let’s assume that you want to sell your car and want to estimate how much your car is worth. You know that factors such as model year, horsepower, and mileage influence the car price. In that case, you could build a multiple linear regression like the one below.
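The original equation image is not reproduced here; a regression of that shape (the coefficients *ß*0–*ß*3 are to be estimated, and the variable names are just the three factors mentioned above) would look like:

```latex
\text{price} = \beta_0 + \beta_1 \cdot \text{model year} + \beta_2 \cdot \text{horsepower} + \beta_3 \cdot \text{mileage}
```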

## But how can I know what are the best values of the Betas?

This part is similar to the simple linear regression: we want to minimize the cost function using gradient descent. If you don’t know what those terms mean, you can learn them in my previous Medium post.

From my previous post, you know that the cost function is the function below.
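The cost-function image is not reproduced here; assuming the usual mean squared error convention, it is:

```latex
J(a, b) = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (a x_i + b)\bigr)^2
```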

And our goal was to find the values of ‘a’ and ‘b’ that minimize the value of the cost function. The derivatives from the simple linear regression were:
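Reconstructing the missing image under the same MSE convention, with the prediction written as ŷᵢ = a·xᵢ + b, the two partial derivatives are:

```latex
\frac{\partial J}{\partial a} = -\frac{2}{n} \sum_{i=1}^{n} x_i \,(y_i - \hat{y}_i)
\qquad
\frac{\partial J}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)
```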

For multiple linear regression, the process is the same, but now we add an *X*0 = 1 to the equation so we can generalize the derivative of the cost function. The multiple linear regression formula then becomes:
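With that *X*0 = 1 convention, the generalized prediction (reconstructed here, since the image is missing) is:

```latex
\hat{y} = \beta_0 X_0 + \beta_1 X_1 + \dots + \beta_p X_p = \sum_{j=0}^{p} \beta_j X_j, \qquad X_0 = 1
```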

The derivative of this function is:
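Consistent with the simple-regression derivatives above, the partial derivative with respect to each weight *ß*j is:

```latex
\frac{\partial J}{\partial \beta_j} = -\frac{2}{n} \sum_{i=1}^{n} x_{ij}\,(y_i - \hat{y}_i)
```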

To update the weights, we just need to multiply the derivative by a learning rate and subtract from the previous weights.
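Written out, with α as the learning rate, the update rule for each weight is:

```latex
\beta_j \leftarrow \beta_j - \alpha \,\frac{\partial J}{\partial \beta_j}
```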

It’s important that we simultaneously update all *ß*.

This is an iterative process. Could we make it more efficient by using matrices?

# Vectorized Multiple Linear Regression

In Python, we can use vectorization to implement multiple linear regression and gradient descent. We can arrange the ys, ßs, and Xs into matrices, as in the image below.

With that representation, we can compute the predicted y using the formula below:

The derivative is given by the formula below:

Finally, to get the updated weights we have the equation below:
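The three formula images are not reproduced here; reconstructed in one place, with y an n-vector, X the n × (p+1) matrix of explanatory variables (including the column of 1s), and ß the weight vector, they are:

```latex
\hat{y} = X\beta
\qquad
\nabla J(\beta) = \frac{2}{n}\, X^{\top} (X\beta - y)
\qquad
\beta \leftarrow \beta - \alpha \,\nabla J(\beta)
```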

# Let’s write the code

First, a piece of advice for people learning Data Science without a software engineering background: always document your code.

So we start with a function called fit_linear_regression that receives the Xs, ys, the learning rate, and epsilon. Epsilon works as a threshold: we will stop when the error is less than epsilon.

- In Step 1 we insert a column of 1s into the X NumPy array to act as the y-intercept term.
- In Step 2 we initialize the ßs, which here I call weights. The weights will be a NumPy array with one entry per variable in X.
- In Step 3 we update the weights until the norm of the partial derivative is less than epsilon.
- In Step 3.1 we compute the predicted values as in figure 9 and the partial derivative as in figure 10.
- In Step 3.2 we compute the norm.
- In Step 3.3 we update the weights as in figure 11.
- The `if` in lines 41 and 42 warns us when we set too high a learning rate and the function diverges.
- The function returns the adjusted weights.
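The code screenshot is not reproduced here, so below is a minimal sketch of a `fit_linear_regression` following the steps above. The `max_iterations` safety cap and the exact divergence check are my assumptions; the intercept column is appended as the last column, so the last weight is the y-intercept.

```python
import numpy as np

def fit_linear_regression(x, y, learning_rate=0.01, epsilon=1e-6, max_iterations=100_000):
    """Fit a multiple linear regression with gradient descent.

    Stops when the norm of the gradient falls below epsilon.
    max_iterations is a safety cap (an assumption, not in the original post).
    """
    # Step 1: append a column of 1s so the last weight acts as the y-intercept.
    X = np.c_[x, np.ones(x.shape[0])]
    n, p = X.shape

    # Step 2: initialize the weights (the ßs), one per column of X.
    weights = np.zeros(p)

    # Step 3: update the weights until the norm of the gradient is below epsilon.
    for _ in range(max_iterations):
        # Step 3.1: vectorized predictions and gradient (figures 9 and 10).
        y_pred = X @ weights
        gradient = (2 / n) * X.T @ (y_pred - y)

        # Step 3.2: norm of the gradient.
        norm = np.linalg.norm(gradient)
        if norm < epsilon:
            break
        if not np.isfinite(norm):
            # Warn when the learning rate is too high and the function diverged.
            raise ValueError("Gradient descent diverged; try a smaller learning rate.")

        # Step 3.3: update all weights simultaneously (figure 11).
        weights -= learning_rate * gradient

    return weights
```

On noiseless synthetic data this recovers the true coefficients; on real data it converges to the ordinary least squares solution as epsilon shrinks.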

Now that we have the correct weights, how do we predict values?

# Making predictions

To make predictions, we take the dot product between the weights array (excluding the last value, which is the y-intercept) and the transposed Xs, and then add the y-intercept to the result.
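As a sketch, that sentence translates into the function below (assuming, as described above, that the y-intercept is stored as the last weight):

```python
import numpy as np

def predict(x, weights):
    """Predict y for each row of x.

    The last weight is the y-intercept; the rest multiply the Xs.
    """
    # Dot product of the slope weights with the transposed Xs,
    # then add the y-intercept.
    return np.dot(weights[:-1], x.T) + weights[-1]
```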

We have now implemented our multiple linear regression from scratch, but how does it compare with sklearn?

# Comparing with sklearn

The first is the Mean Squared Error from the sklearn model and the second is the MSE from our function.

As we can see, they are similar: our function’s MSE is just 0.004 greater than sklearn’s.
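The post’s dataset and exact numbers are not reproduced here, but a comparison along these lines can be sketched on synthetic data (an assumption); the inline gradient-descent loop stands in for the fit function described earlier:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic data (the post's real dataset is not reproduced here).
rng = np.random.default_rng(42)
x = rng.normal(size=(300, 3))
y = x @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=300)

# sklearn's fit and its MSE.
sk_model = LinearRegression().fit(x, y)
sk_mse = mean_squared_error(y, sk_model.predict(x))

# Our gradient-descent fit: append the intercept column and iterate.
X = np.c_[x, np.ones(len(x))]
weights = np.zeros(X.shape[1])
for _ in range(20_000):
    gradient = (2 / len(y)) * X.T @ (X @ weights - y)
    weights -= 0.05 * gradient
our_mse = mean_squared_error(y, X @ weights)

print(sk_mse, our_mse)  # the two MSEs are nearly identical
```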

# Conclusions

In this post, you have learned:

- What multiple linear regression is.
- How to fit a multiple linear regression model.
- A vectorized multiple linear regression formula.
- How to implement your own multiple linear regression using only Python and NumPy.

You can see the code used to write this post in this Colab notebook.

If you like what you read be sure to 👏 it below, share it with your friends and follow me to not miss this series of posts.