An introduction to linear models

Gianluca Malato
Published in Your Data Teacher
5 min read · Mar 29, 2021


Photo by Michael Dziedzic on Unsplash

Linear models are among the simplest models in machine learning. Despite their simplicity they are very powerful: they are often less prone to overfitting than more complex models, and their coefficients give us useful information about feature importance.

Let’s see how they work.

Basic concepts of linear models

All linear regression models share the same idea: the target variable is modeled as a linear combination of the input features.
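As a sketch of what this looks like in symbols, here is one common way to write the linear combination. The notation is an assumption: n features x_1, ..., x_n, one coefficient a_i per feature, and an intercept a_0 that some formulations leave out.

```latex
y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n = a_0 + \sum_{i=1}^{n} a_i x_i
```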

The coefficients a are estimated by minimizing some cost function.

A linear combination is a very simple model, and linear relationships are very common in nature. Starting from this approach, we can build several different models.

Linear regression

When you face a regression problem, linear regression is usually a sensible first choice. This linear model estimates the coefficients by minimizing the Mean Squared Error cost function.
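To make the cost function concrete, here is a minimal NumPy sketch of a linear prediction and its Mean Squared Error. The toy data, the helper names `predict` and `mean_squared_error`, and the coefficient values are all hypothetical, chosen only for illustration.

```python
import numpy as np

def predict(X, a0, a):
    """Linear combination of the features: a0 + X @ a."""
    return a0 + X @ a

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

# Hypothetical toy data generated exactly from y = 1 + 2*x1 - 3*x2 (no noise).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

# The generating coefficients reproduce y exactly, so their cost is zero.
mse_true = mean_squared_error(y, predict(X, 1.0, np.array([2.0, -3.0])))

# Any other choice of coefficients gives a larger cost.
mse_wrong = mean_squared_error(y, predict(X, 0.0, np.array([1.0, 1.0])))

print(mse_true, mse_wrong)
```

Linear regression searches for the coefficients that make this number as small as possible on the training data.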

This cost function is very simple, and the optimization problem can even be solved analytically (although numerical methods are usually preferred in practice).
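The sketch below compares the two routes on synthetic data. It assumes the analytical solution is the usual normal-equations form, (XᵀX) a = Xᵀy, and uses NumPy's `np.linalg.lstsq` as one example of a numerical routine. The data, seed and coefficient values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 3 features, known coefficients plus small noise.
X = rng.normal(size=(200, 3))
true_a = np.array([1.5, -2.0, 0.5])
y = 4.0 + X @ true_a + rng.normal(scale=0.1, size=200)

# Prepend a column of ones so the intercept is estimated like any other coefficient.
X1 = np.column_stack([np.ones(len(X)), X])

# Analytical route: solve the normal equations (X^T X) a = X^T y.
a_normal = np.linalg.solve(X1.T @ X1, X1.T @ y)

# Numerical route: least-squares solver, which avoids forming X^T X explicitly.
a_lstsq, *_ = np.linalg.lstsq(X1, y, rcond=None)

print(a_normal)
print(a_lstsq)
```

On well-behaved data like this the two estimates agree closely. The solver is generally the safer choice when XᵀX is close to singular.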

The main problem with linear regression is that it is sensitive to collinearity, that is, correlation between the features. In fact, if we consider the variance of the prediction, we get:
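The effect of collinearity can be illustrated numerically. The sketch below relies on a standard least-squares result, taken here as an assumption rather than as the expression the text refers to: the variance of the estimated coefficients is proportional to the diagonal of (XᵀX)⁻¹. The helper name `coef_variance_factors` and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

def coef_variance_factors(X):
    """Diagonal of (X^T X)^{-1} for centered features.

    Under the usual least-squares assumptions, each coefficient's variance
    is the noise variance multiplied by the corresponding entry.
    """
    Xc = X - X.mean(axis=0)  # centering removes the intercept from the picture
    return np.diag(np.linalg.inv(Xc.T @ Xc))

x1 = rng.normal(size=n)

# Case 1: the second feature is independent of the first.
X_indep = np.column_stack([x1, rng.normal(size=n)])

# Case 2: the second feature is almost a copy of the first (strong collinearity).
X_coll = np.column_stack([x1, x1 + rng.normal(scale=0.01, size=n)])

var_indep = coef_variance_factors(X_indep)
var_coll = coef_variance_factors(X_coll)

print("independent features:", var_indep)
print("collinear features:  ", var_coll)
```

With nearly identical features the variance factors grow by orders of magnitude, so small changes in the data can produce very different coefficient estimates.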


Gianluca Malato
Your Data Teacher

Theoretical physicist, data scientist and fiction author. I teach data science, statistics and SQL on YourDataTeacher.com. E-mail: gianluca@gianlucamalato.it