The Pitfalls of Linear Regression and How to Avoid Them

What to Do When the Linear Regression Assumptions Don’t Hold

Genevieve Hayes, PhD
Published in Analytics Vidhya
12 min read · Sep 28, 2019

You can always spot a data science newbie by the speed with which they jump to fitting a neural network.

Neural networks are cool and can do awesome things that, for many of us (myself included), are the reason why we got into data science in the first place. I mean, who goes into data science to play around with daggy old linear regression models?

Yet the irony is that, unless you work in a specialist field such as computer vision or natural language processing, simple models like linear regression often provide a better solution to your problem than complex black-box models like neural networks and support vector machines.

After all, linear regression models are:

  • fast to train and query;
  • less prone to overfitting than more flexible models and efficient in their use of data, so they can be applied to relatively small datasets; and
  • easy to explain, even to people from a non-technical background.

I’ve heard senior data scientists, with experience working with cutting-edge AI, sing the praises of linear regression for these very reasons.
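To make the speed and explainability points concrete, here is a minimal sketch of fitting a linear regression by ordinary least squares. Everything in it is an assumption for illustration only: the synthetic dataset, the "true" intercept and coefficients, and the variable names are all hypothetical, and NumPy's `np.linalg.lstsq` is just one of several ways to fit such a model.

```python
import numpy as np

# Hypothetical synthetic data (not from any real problem):
# 50 observations, 2 features, and a known linear relationship plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
true_intercept = 1.0
true_coefs = np.array([2.0, -3.0])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.1, size=50)

# Ordinary least squares: add a column of ones for the intercept,
# then solve the least-squares problem in a single call.
X_design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

intercept, coefs = beta[0], beta[1:]
print("intercept:", round(float(intercept), 2))
print("coefficients:", np.round(coefs, 2))
```

The fit is a single linear algebra call, even on a dataset this small, and each fitted coefficient can be read directly as the expected change in `y` for a one-unit increase in that feature, with the other feature held fixed.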

Genevieve Hayes, PhD

Data scientist and educator with a PhD in Statistics. Helping data professionals maximize the value of data without expensive tools. www.genevievehayes.com