Machine Learning: Multivariate Regression from Scratch (Python)

Dhruv Shrinet · Published in The Startup · Jun 11, 2020 · 4 min read

Regression with more than one feature is called multivariate regression, and it is almost the same as linear regression, just with a bit of modification.

In my previous post I talked about linear regression from scratch in Python. Go check it out if you have not: Click here.

In this post we will see how we can implement multivariate regression. I would really appreciate it if you brush up a bit on multivariate regression first. If you are looking for a tutorial, I would recommend Andrew Ng's course (Check this out); it gives a good description of how multivariate regression differs from linear regression and how the formula changes.

By now, you must know that multivariate regression has more than one feature compared to linear regression,

and in the same way the hypothesis formula also changes: the number of thetas becomes n+1 (in some notations they are called b).
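Concretely, with n features the hypothesis has n+1 parameters in total:

h(x) = θ₀ + θ₁·x₁ + θ₂·x₂ + … + θₙ·xₙ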

Let’s begin with the coding part; I will explain it side by side. The dataset used is the diabetes dataset from the sklearn library. You can use many others, such as Boston Housing; it just needs to be a regression dataset with more than one feature.

Data Loading

It’s pretty much self-explanatory: we just load the dataset from sklearn. Let’s see what our data looks like; we got 10 features.
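A minimal sketch of the loading step (the variable names X and y are just illustrative):

```python
from sklearn.datasets import load_diabetes
import numpy as np

data = load_diabetes()
X, y = data.data, data.target

print(X.shape)   # (442, 10): 442 rows, 10 features
print(X[:2])     # the values are already scaled
```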

Looks like our data is already normalized, with 442 rows and 10 features.

The dataset is quite small, but let’s see what we can do with it.

Some more data processing

We also need to add 1’s to our X so that the dimension of X matches the dimension of the thetas. X has m rows (the length of the dataset) and n columns, but we need to add one more feature so that X is (m, n+1), matching the n+1 thetas. For that we can use numpy.
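A quick sketch of this step, assuming X from the loading step above:

```python
# Prepend a column of ones so X becomes (m, n+1), matching the n+1 thetas
m = X.shape[0]
X = np.hstack((np.ones((m, 1)), X))
print(X.shape)   # (442, 11)
```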

Our new dataset has one extra column at the start, with all ones.

Hypothesis Function

Just like linear regression, the formula is the same; it’s just that the thetas and features have increased, so we compute the sum with a loop that iterates feature-wise. Each value of x (a single feature at the i’th step) is multiplied by the i’th theta and added to our result. Note that the loop bound here is actually the number of features, not the number of rows.
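A minimal sketch of that function (the name hypothesis is illustrative):

```python
def hypothesis(x, thetas):
    # x is one row of X (n+1 values, including the leading 1)
    result = 0
    for i in range(len(thetas)):       # iterate feature-wise
        result += thetas[i] * x[i]     # i'th theta times i'th feature value
    return result
```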

Loss Function

We are going to use the same loss (mean squared error), and the code is actually the same as in linear regression. Here X is the whole matrix (442×11 once the column of ones is added); we pass the entire matrix.
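A sketch along those lines, reusing the hypothesis function from above:

```python
def loss(X, y, thetas):
    # Mean squared error over the whole matrix
    m = X.shape[0]
    total = 0
    for i in range(m):
        total += (hypothesis(X[i], thetas) - y[i]) ** 2
    return total / m
```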

Derivative of the cost function J(θ)

The part we need to find is the partial derivative of the cost with respect to each theta: ∂J/∂θ_j = (1/m) Σ_i (h(x_i) − y_i) · x_ij.

As we know, we need to find thetas for n values (the number of features), so we need two nested loops: one over the features, and the other to iterate over the whole dataset row-wise.

We take the mean of the gradients by dividing by the total number of rows.
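Putting the two nested loops together, a sketch of the gradient might look like this (the name gradient is illustrative):

```python
def gradient(X, y, thetas):
    # dJ/dtheta_j = (1/m) * sum_i (h(x_i) - y_i) * x_ij
    m, n = X.shape
    grads = np.zeros(n)
    for j in range(n):                 # loop 1: one gradient per feature/theta
        for i in range(m):             # loop 2: every row of the dataset
            grads[j] += (hypothesis(X[i], thetas) - y[i]) * X[i][j]
        grads[j] /= m                  # mean over the total rows
    return grads
```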

Gradient Descent Function for Training

Now we need to find all of the thetas (n+1), and let’s also track the total loss. Here we again need two nested loops: one for the epochs, and the other to loop over all the thetas and adjust their values with the update formula.

For updating the thetas, the rule is θ_j := θ_j − α · ∂J/∂θ_j, where α is the learning rate.

In return, you get the thetas and a list of losses.
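A sketch of the training loop under those assumptions (lr and epochs are illustrative hyper-parameter names):

```python
def train(X, y, lr=0.1, epochs=1000):
    # Gradient descent: returns the learned thetas and the loss per epoch
    n = X.shape[1]
    thetas = np.zeros(n)
    loss_list = []
    for _ in range(epochs):            # loop 1: epochs
        grads = gradient(X, y, thetas)
        for j in range(n):             # loop 2: adjust every theta
            thetas[j] = thetas[j] - lr * grads[j]
        loss_list.append(loss(X, y, thetas))
    return thetas, loss_list
```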

OUTCOME

Let’s look at the loss and plot it with matplotlib. We see a large loss, maybe because there is very little data to train on, and the extra features increase the complexity.
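For example, something like:

```python
import matplotlib.pyplot as plt

thetas, loss_list = train(X, y, lr=0.1, epochs=1000)

plt.plot(loss_list)                    # loss should decrease per epoch
plt.xlabel("epoch")
plt.ylabel("MSE loss")
plt.show()
```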

The r2_score

For the r2_score we first need the predictions, and we can get those with the thetas and the hypothesis function we created.
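For example, using sklearn’s r2_score:

```python
from sklearn.metrics import r2_score

# One prediction per row, using the learned thetas
preds = [hypothesis(X[i], thetas) for i in range(X.shape[0])]
print(r2_score(y, preds))
```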

Further Optimization

The accuracy is pretty low, and that’s because of the data we have, which is just 442 rows. For better results we may need a bigger dataset, and we can also play with the hyper-parameters (learning rate, number of epochs). But for now we got almost 60%.

As you can see, we have used nested loops in almost every function to iterate over the thetas and the rows, which increases the running time of the program. Instead, we can use broadcasting in numpy, with operations like .dot() and .sum(), which replace the explicit Python loops with fast vectorized code.
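For instance, the whole gradient can be written without explicit loops; this is the same math as the nested-loop version above, just vectorized:

```python
def gradient_vectorized(X, y, thetas):
    # Same gradient as before, computed with broadcasting
    m = X.shape[0]
    errors = X.dot(thetas) - y         # shape (m,): h(x_i) - y_i for every row
    return X.T.dot(errors) / m         # shape (n,): mean gradient per theta
```

Dropping this in place of gradient() inside the training loop gives identical results, just much faster on larger datasets.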
