MULTIPLE LINEAR REGRESSION

Akshita Guru
4 min read · Mar 31, 2024


Hi there again! Today we will talk about multiple linear regression. Here is a link to our previous discussion of linear regression, in case you haven’t seen it yet.

In the original version of linear regression, we had a single feature, x, the size of the house, and we used it to predict y, the price of the house. But what if, in addition to the size of the house, we also knew the number of bedrooms, the number of floors, and the age of the home in years? It seems like this would give us a lot more information with which to predict the price.

Now have a look at the image above. We will use x1, x2, x3, and x4 to denote the four features. A few more notational conventions, listed in the image, keep things simple: x^(i) denotes the features of the i-th training example, and x^(i)_j denotes the value of feature j in the i-th training example. For example, x^(2) equals [1416, 3, 2, 40]; this is a row vector containing all four features of the second training example. Likewise, x^(2)_3 is the value of the third feature, the number of floors, in the second training example, so it equals 2.
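To make the indexing concrete, here is a minimal sketch in NumPy; the training set is hypothetical except for its second row, which matches the example above. Note that Python indexes from 0, while our notation counts from 1.

```python
import numpy as np

# Hypothetical training set: each row is one example with the four
# features (size in sq ft, bedrooms, floors, age in years).
X_train = np.array([
    [2104, 5, 1, 45],  # x^(1)
    [1416, 3, 2, 40],  # x^(2), the example from the text
    [852,  2, 1, 35],  # x^(3)
])

x_2 = X_train[1]       # x^(2): all features of the second example
x_2_3 = X_train[1, 2]  # x^(2)_3: the number of floors in that example
print(x_2, x_2_3)      # [1416    3    2   40] 2
```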

Now that we have multiple features, let’s take a look at what the model looks like. Previously, with a single feature, we defined the model as f(x) = wx + b, where x was a single number. But now, with multiple features, we’re going to define it differently.

Now, for example, let’s assume the parameters take the following values: f(x) = 0.1x1 + 4x2 + 10x3 - 2x4 + 80.

If the model predicts the price of the house in thousands of dollars, you can think of b = 80 as saying that the base price of a house starts at $80,000, assuming it has no size, no bedrooms, no floors, and no age. You can think of the 0.1 as saying that for every additional square foot, the price increases by 0.1 × $1,000, which is $100. Likewise, for each additional bedroom the price increases by $4,000, for each additional floor it increases by $10,000, and for each additional year of the house’s age it decreases by $2,000, because that parameter is -2.
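As a quick sanity check of this interpretation, here is a minimal sketch that plugs a hypothetical house (the 1416 sq ft example from earlier) into the model with these parameter values; prices are in thousands of dollars.

```python
import numpy as np

w = np.array([0.1, 4, 10, -2])  # weights for size, bedrooms, floors, age
b = 80                          # base price: $80,000

# Hypothetical house: 1416 sq ft, 3 bedrooms, 2 floors, 40 years old
x = np.array([1416, 3, 2, 40])

price = w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + w[3] * x[3] + b
print(price)  # ~173.6, i.e. about $173,600
```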

In general, if you have n features, then the model looks like this: f(x) = w1x1 + w2x2 + … + wnxn + b.

Let’s define w as the list of parameters w1, w2, …, wn; this is a row vector.

Here, again, we define x as the list of features x1, x2, …, xn; this is also a row vector.

Finally, b is a single number, not a vector.

With this notation, the model can be rewritten more succinctly as f(x) = w · x + b, where the dot refers to the dot product from linear algebra.
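To make the two forms concrete, here is a minimal sketch that computes the prediction both ways: once as the term-by-term sum from the general formula, and once with NumPy’s dot product. The parameter values are the hypothetical ones from the example above.

```python
import numpy as np

w = np.array([0.1, 4, 10, -2])
b = 80
x = np.array([1416, 3, 2, 40])

# Without the dot product: accumulate w_j * x_j one term at a time
f = 0
for j in range(len(w)):
    f += w[j] * x[j]
f += b

# With the dot product: one call performs the same computation
f_dot = np.dot(w, x) + b

print(f, f_dot)  # both ~173.6
```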

The name for this type of linear regression model with multiple input features is multiple linear regression. This is in contrast to univariate regression, which has just one feature.

In order to implement this, there’s a really neat trick called vectorization, which makes this model, and many other learning algorithms, much simpler to implement.

Stay tuned for the next article to learn about vectorization and its magic.

I hope you found this summary of multiple linear regression to be interesting.

You can connect with me on the following:

Linkedin | GitHub | Medium | email : akshitaguru16@gmail.com
