The Pitfalls of Linear Regression and How to Avoid Them

What to Do When the Linear Regression Assumptions Don’t Hold

Genevieve Hayes, PhD

Published in Analytics Vidhya

12 min read · Sep 28, 2019

You can always spot a data science newbie by the speed with which they jump to fitting a neural network.

Neural networks are cool and can do awesome things that, for many of us (myself included), are the reason why we got into data science in the first place. I mean, who goes into data science to play around with daggy old linear regression models?

Yet the irony is that, unless you are working in a specialist field such as computer vision or natural language processing, simple models like linear regression often provide a better solution to your problem than complex black-box models like neural networks and support vector machines.

After all, linear regression models are:

fast to train and query;
less prone to overfitting than more flexible models, and efficient in their use of data, so they can be applied to relatively small datasets; and
easy to explain, even to people from a non-technical background.

I’ve heard senior data scientists, with experience working with cutting-edge AI, sing the praises of linear regression for these very reasons.
