Decoding:

Multiple Linear Regression

Calculations by hand

Nishigandha Sharma
Analytics Vidhya

--

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. These variables can be both categorical and numerical in nature.

Please note: Categorical variables must first be encoded numerically, on an ordinal or nominal scale, by assigning a weight to each group of the category. The formula then works with the weights assigned to each category.

Multiple regression is an extension of simple linear regression, which uses just one explanatory variable. The result is still a line equation, but the contributing variables now come from many dimensions. Multiple linear regression is also the base model for polynomial models of degree 2, 3, or more.

If you want to understand the computation of simple linear regression, check out the article here.

y = b₀ + b₁X₁ + ⋯ + bᵣXᵣ + ε

b0 — constant / y-intercept

b1, b2 — coefficients for each variable

X1, X2 — predictors

ε — error term, also known as the epsilon value. It is a small, negligible value, and we will not consider it in this calculation.
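To make this equation concrete, here is a minimal sketch of the prediction rule in Python; the coefficient and predictor values below are placeholders for illustration only, not values fitted to our data:

```python
# y_hat = b0 + b1*X1 + ... + br*Xr, with the error term ε omitted.
def predict(b0, coefficients, predictors):
    """Predicted y for one observation, given fitted coefficients."""
    return b0 + sum(b * x for b, x in zip(coefficients, predictors))

# Placeholder values for illustration only.
print(predict(b0=1.0, coefficients=[0.5, 2.0], predictors=[3.0, 4.0]))
# 1.0 + 0.5*3.0 + 2.0*4.0 = 10.5
```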

We will use the dummy data below for our calculations:

Data under consideration

Here X1 and X2 are the predictors and y is the dependent variable. From the formula of the multiple linear equation given above, we need to calculate b0, b1, and b2. Let's look at the formula for b0 first.

b0 = ȳ - b1*x̄1 - b2*x̄2

As you can see, to calculate b0 we first need b1 and b2. Let's look at their formulae:

b1 = [ (Σx2_sq)(Σx1 y) - (Σ x1 x2)(Σx2 y) ] / [ (Σx1_sq)(Σx2_sq) - (Σ x1 x2)² ]

b2 = [ (Σx1_sq)(Σx2 y) - (Σ x1 x2)(Σx1 y) ] / [ (Σx1_sq)(Σx2_sq) - (Σ x1 x2)² ]

Now this definitely looks like a terrifying formula, but if you look closely, the denominator is the same for both b1 and b2, and each numerator is built from cross-products of the two variables x1 and x2 along with y.
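In code, once the corrected sums (defined in the next step) and the sample means are known, b1, b2 and then b0 follow directly; a minimal sketch, with variable names of my own choosing:

```python
def solve_coefficients(S_x1x1, S_x2x2, S_x1x2, S_x1y, S_x2y,
                       x1_mean, x2_mean, y_mean):
    """Compute b1 and b2 from the corrected sums, then b0 from the means."""
    denom = S_x1x1 * S_x2x2 - S_x1x2 ** 2            # shared denominator
    b1 = (S_x2x2 * S_x1y - S_x1x2 * S_x2y) / denom
    b2 = (S_x1x1 * S_x2y - S_x1x2 * S_x1y) / denom
    b0 = y_mean - b1 * x1_mean - b2 * x2_mean        # y-intercept
    return b0, b1, b2
```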

What is noteworthy is that the values of x1 and x2 here are not the same as our predictors X1 and X2; they are computed (mean-corrected) values derived from the predictors. Before we find b1 and b2, we will compute the following quantities for both x1 and x2, so that we can then compute b1 and b2, followed by b0:

· (Σxi_sq)

· (Σxi y)

· (Σ x1 x2)

Here ‘i’ stands for the predictor index, say variable 1 or variable 2, and N is the number of records, which is 10 in this case. Now we can look at the formulae for each of the quantities needed to compute the coefficients.

(Σxi_sq) = (ΣXi²) - (ΣXi)² / N

(Σxi y) = (ΣXi y) - ((ΣXi)(Σy)) / N

(Σ x1 x2) = (ΣX1 X2) - ((ΣX1)(ΣX2)) / N

It looks like we again have three petrifying formulae, but do not worry; let's take one step at a time and compute the needed values in the table itself.

Original Data computed with the additional required columns
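The same quantities can also be computed directly in code; a minimal NumPy sketch, where X1, X2 and y are assumed to be arrays holding the ten records from the table:

```python
import numpy as np

def corrected_sums(X1, X2, y):
    """Corrected sums of squares and cross-products for two predictors."""
    N = len(y)
    S_x1x1 = np.sum(X1 ** 2) - np.sum(X1) ** 2 / N          # (Σx1_sq)
    S_x2x2 = np.sum(X2 ** 2) - np.sum(X2) ** 2 / N          # (Σx2_sq)
    S_x1y = np.sum(X1 * y) - np.sum(X1) * np.sum(y) / N     # (Σx1 y)
    S_x2y = np.sum(X2 * y) - np.sum(X2) * np.sum(y) / N     # (Σx2 y)
    S_x1x2 = np.sum(X1 * X2) - np.sum(X1) * np.sum(X2) / N  # (Σ x1 x2)
    return S_x1x1, S_x2x2, S_x1x2, S_x1y, S_x2y
```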

Great, now we have all the required values, which, when plugged into the above formulae, give the following results:

Computation required for coefficients

Now it’s time to compute b1, b2 and b0:

Coefficients and the y-intercept calculation

We now have the equation of our multiple linear regression line:

y = b0 + b1*X1 + b2*X2

y = (-0.72) + 0.02(X1) + 0.38(X2)

Now let's compute a prediction for a new value and compare it with the result from Sklearn's library as well:

Say X1 = 5 and X2 = 6:

y = (-0.72) + 0.02(5) + 0.38(6) = -0.72 + 0.10 + 2.28 = 1.66

Now let's compare it with Sklearn's Linear Regression.

Note: Sklearn provides the same LinearRegression class for computing both simple and multiple linear regression.

Linear Regression from Sklearn’s library
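A minimal sketch of that comparison, assuming X1, X2 and y are arrays holding the ten records from the table above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_and_compare(X1, X2, y):
    """Fit Sklearn's LinearRegression and report the fitted parameters."""
    X = np.column_stack([X1, X2])          # shape (10, 2): one row per record
    model = LinearRegression().fit(X, y)
    print(model.intercept_, model.coef_)   # should match b0 ≈ -0.72, b1 ≈ 0.02, b2 ≈ 0.38
    print(model.predict([[5, 6]]))         # should match the hand-computed 1.66
```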

Yay!!! We have the exact same results with the inbuilt Linear Regression function too.

We can thus conclude that our calculations are correct.

I hope you now have more clarity on how a multiple linear regression model is computed in the back end.

Any feedback is most welcome. Give a clap if you learnt something new today!
