What does the term “Linear” in linear regression mean?

Biswajit Pattnaik ❤️🙏
2 min read · Apr 13, 2020


Linear regression is usually the first machine learning or statistical modelling technique everyone learns in class, and it is applied in a wide range of fields. Whenever you want to describe the relationship between a continuous response variable and one or more continuous or categorical independent (regressor) variables, regression is the usual starting point: a scatterplot for simple linear regression, and then a statistical package for multiple linear regression. A scatterplot of the response variable against a continuous independent variable shows the nature of the relationship between the two, i.e. whether it is linear, polynomial or of some other higher-order form. But even if the scatterplot is a curve rather than a straight line, we still use linear regression for curve fitting. Have you ever thought about why? At times we transform the variables or add higher-order terms of the independent variable to define the functional relationship, but we still call the model linear. So what is the explanation?

Linearity can mean two things: linear in the variables, or linear in the parameters. The linearity assumption in linear regression means the model is linear in its parameters (i.e. the coefficients of the variables); it may or may not be linear in the variables. Confused?

Look at the equations below:

Y = a + bx    (1)

is linear in both the variable and the parameters, since the coefficient “b” and the variable “x” each appear with power 1.

Y = a + bx + cx^2    (2)

is linear in the parameters but not in the variable, because the highest power of x is 2.

Y = a + (b^2)x    (3)

is linear in the variable but not in the parameters, because the parameter b appears with power 2.

Y = a + (b^2)x + cx^2    (4)

is linear in neither the parameters nor the variable, since both the parameter b and the variable x appear with power 2.
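One way to make the distinction concrete is a small numerical spot check. A function that is linear in its parameters satisfies superposition: adding two parameter sets adds the outputs, and doubling a parameter set doubles the output. The sketch below applies that check to the four equations above. The function names and the test values are made up for illustration, and passing a spot check at a few points is a sanity test rather than a proof.

```python
# Hypothetical helpers: each model takes x and a tuple of parameters.
def model_1(x, p):  # Y = a + b*x
    a, b = p
    return a + b * x

def model_2(x, p):  # Y = a + b*x + c*x^2
    a, b, c = p
    return a + b * x + c * x**2

def model_3(x, p):  # Y = a + (b^2)*x
    a, b = p
    return a + b**2 * x

def model_4(x, p):  # Y = a + (b^2)*x + c*x^2
    a, b, c = p
    return a + b**2 * x + c * x**2

def is_linear_in_parameters(model, p1, p2, xs=(0.5, 1.0, 2.0, 3.0), tol=1e-9):
    """Spot-check superposition in the parameters at a few x values."""
    summed = tuple(u + v for u, v in zip(p1, p2))
    doubled = tuple(2 * u for u in p1)
    for x in xs:
        # Additivity: f(x, p1 + p2) should equal f(x, p1) + f(x, p2).
        if abs(model(x, summed) - (model(x, p1) + model(x, p2))) > tol:
            return False
        # Homogeneity: f(x, 2 * p1) should equal 2 * f(x, p1).
        if abs(model(x, doubled) - 2 * model(x, p1)) > tol:
            return False
    return True

# Arbitrary illustrative parameter values.
checks = {
    "(1)": (model_1, (1.0, 2.0), (0.5, 3.0)),
    "(2)": (model_2, (1.0, 2.0, 0.5), (0.5, 3.0, 1.5)),
    "(3)": (model_3, (1.0, 2.0), (0.5, 3.0)),
    "(4)": (model_4, (1.0, 2.0, 0.5), (0.5, 3.0, 1.5)),
}
results = {label: is_linear_in_parameters(m, p1, p2)
           for label, (m, p1, p2) in checks.items()}
print(results)
# {'(1)': True, '(2)': True, '(3)': False, '(4)': False}
```

Equation (2) passes even though it contains x^2, because the check varies only the parameters and never asks whether the curve in x is a straight line.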

So we use linear regression for models like equations (1) and (2), which are linear in the parameters even though they may or may not be linear in the variables.

When we talk of linearity in linear regression, we mean linearity in the parameters. So even if the relationship between the response variable and the independent variable is a curve rather than a straight line, we can still fit it with linear regression by including higher-order terms of the variable.
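As a rough illustration, here is a minimal sketch that fits the quadratic model of equation (2) with ordinary least squares. The trick is to treat 1, x and x^2 as three separate columns, so the model stays linear in a, b and c. The data are made up and noise-free, the helper names are hypothetical, and the normal equations are solved with a small hand-written Gaussian elimination only to keep the example dependency-free. In practice you would more likely reach for a statistics or linear algebra library.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up, noise-free data generated from Y = 1 + 2x + 0.5x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 0.5 * x**2 for x in xs]

# Each row of the design matrix holds the regressors 1, x and x^2.
design = [[1.0, x, x**2] for x in xs]
a, b, c = fit_least_squares(design, ys)
print(round(a, 6), round(b, 6), round(c, 6))
```

The fitted a, b and c should come back close to 1, 2 and 0.5, the values used to generate the data, even though a plot of Y against x is a parabola.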

Even Y = e^(a+bx)

can be handled with linear regression, because if we take the natural log of both sides it becomes

Log Y = a + bx, which is linear in the parameters a and b.

So whenever the model is not linear in the parameters, and cannot be made so by a transformation, we use nonlinear regression.

While some models may look intimidating and nonlinear at first glance, they can be converted to a linear model by a suitable transformation.

The famous Cobb-Douglas production function

Y = a(x^b)(z^c) may look nonlinear at first sight, but it can still be converted into a linear model by taking the log of both sides of the equation.

It transforms to Log Y = Log a + b Log x + c Log z, which is linear in the parameters Log a, b and c.
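To see the Cobb-Douglas transformation at work, the sketch below regresses Log Y on Log x and Log z and then recovers a by exponentiating the fitted intercept. Everything here is an assumption made for illustration: the input values, the "true" parameters a = 2, b = 0.6, c = 0.4, and the helper names. Because there are exactly three coefficients, the normal equations are solved with Cramer's rule to keep the example short. A real analysis would use a regression library and noisy data.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_three_coefficients(rows, y):
    """OLS for exactly three coefficients: solve (X'X) beta = X'y by Cramer's rule."""
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    d = det3(xtx)
    beta = []
    for k in range(3):
        # Replace column k of X'X with X'y, as Cramer's rule prescribes.
        mk = [row[:] for row in xtx]
        for i in range(3):
            mk[i][k] = xty[i]
        beta.append(det3(mk) / d)
    return beta

# Made-up (x, z) input pairs and noise-free output from Y = 2 * x^0.6 * z^0.4.
inputs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 2.0)]
ys = [2.0 * x**0.6 * z**0.4 for x, z in inputs]

# Log-log form: Log Y = Log a + b * Log x + c * Log z.
design = [[1.0, math.log(x), math.log(z)] for x, z in inputs]
log_ys = [math.log(y) for y in ys]

log_a, b, c = fit_three_coefficients(design, log_ys)
a = math.exp(log_a)  # undo the log to recover the original scale parameter
print(round(a, 6), round(b, 6), round(c, 6))
```

The regression estimates Log a rather than a itself, so the last step converts the intercept back to the original parameter.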

So the next time someone asks you what linearity means in linear regression, I hope you can explain it well.
