Linear Regression and Its Not-So-Weird Assumptions

Prakash B · Nerd For Tech · Jul 3, 2021

What is linear regression?


It is a statistical model that tries to find a linear relationship between the input variables and the given output variable, if such a relationship exists.

Types of Linear Regression

Broadly, there are two classes: simple and multivariate linear regression.

Simple linear regression involves the usage of a single input variable to predict the output variable, e.g. predicting weight given height (but it might not be the perfect approach).

Multivariate linear regression involves the usage of multiple input variables to predict the output variable, e.g. predicting loan defaults or predicting house prices.
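To make that concrete, here is a minimal sketch of a multivariate fit. It uses scikit-learn and synthetic data; the feature meanings and numbers are made up for illustration, not taken from any real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic data: 100 houses, 3 features (say area, rooms, age)
X = rng.normal(size=(100, 3))
true_beta = np.array([3.0, -1.5, 0.5])
y = X @ true_beta + 2.0 + rng.normal(scale=0.1, size=100)  # 2.0 plays the role of the bias

model = LinearRegression().fit(X, y)
print(model.coef_)       # should land close to [3.0, -1.5, 0.5]
print(model.intercept_)  # should land close to 2.0
```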

Model

The general equation for linear regression is

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i$$

where $i$ represents the instance of the data point and $1, \dots, p$ index the $p$ different input values that we use to predict $y$.

Using matrix notation and basic matrix multiplication, we can write this compactly as $y_i = \mathbf{x}_i^\top \boldsymbol{\beta}$ (absorbing the bias $\beta_0$ by appending a constant 1 to each $\mathbf{x}_i$), or for all data points at once as $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$.
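A tiny numpy sketch of that matrix form (the numbers are arbitrary, just to show the shapes):

```python
import numpy as np

# Each row is one x_i^T; the first column of 1s carries the bias term beta_0
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 4.0, 5.0]])
beta = np.array([0.5, 2.0, -1.0])  # [beta_0, beta_1, beta_2]

y_hat = X @ beta  # all the x_i^T * beta products in one matrix multiplication
print(y_hat)      # [1.5, 3.5]
```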

Solution

Now that we have the model in our hands, we need to find out the values of the betas (weights).

The cost function that we use for linear regression is generally SSE (sum of squared errors). The other alternatives are RMSE and MAE; the choice of cost function mainly depends on what our objective is.
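For reference, the standard definitions, with $n$ data points and predictions $\hat{y}_i$:

$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$$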

We know that we eventually want values for the betas that would reduce the cost function.

For our model the cost function is

$$\mathrm{SSE}(\boldsymbol{\beta}) = \sum_{i=1}^{n} \left(y_i - \mathbf{x}_i^\top \boldsymbol{\beta}\right)^2 = (\mathbf{y} - X\boldsymbol{\beta})^\top (\mathbf{y} - X\boldsymbol{\beta})$$

and to minimize it we find the gradient with respect to $\boldsymbol{\beta}$ and equate it to 0.

As we said before, the model can be written both as a linear equation and as a matrix product, and hence we can solve for the betas in multiple ways based on how we decide to represent it.

1. Normal Equation

When we take the derivative in the matrix form of the model and equate it to zero, we end up getting

$$\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$$

I know this seems kinda vague; fret not, I'll link you up to some good resources at the end.
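Here is a sketch of the normal equation in numpy. One note of my own: solving the linear system with np.linalg.solve is numerically preferable to forming the inverse explicitly, even though the formula above is written with an inverse.

```python
import numpy as np

def normal_equation(X, y):
    # Solve (X^T X) beta = X^T y instead of computing (X^T X)^-1 directly
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])  # bias column + 2 features
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=50)

print(normal_equation(X, y))  # close to [1.0, 2.0, -0.5]
```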

2. Gradient Descent

The basic idea of gradient descent involves finding the gradient at each step and moving in the direction opposite to it in order to reach the minimum point.

There are different variants of gradient descent, like stochastic and batch gradient descent.
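A minimal batch gradient descent sketch for this cost. I've used the averaged cost (MSE) rather than the raw SSE so the learning rate doesn't have to shrink with the dataset size; the learning rate and step count are arbitrary choices.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_steps=2000):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_steps):
        grad = (2 / n) * X.T @ (X @ beta - y)  # gradient of the mean squared error
        beta -= lr * grad                      # step opposite to the gradient
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.1, size=50)

print(gradient_descent(X, y))  # should agree with the normal-equation solution
```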

Metrics

How do we know how well our model is performing?

We could use the raw RMSE value itself, but the issue is that it isn't bounded (its scale depends on the units of the output). So we make use of the R² value.

Is it good enough?

It may not be the best. Imagine a problem where you have hundreds of input features: if we decide to do incremental feature addition to improve our model, the R² value will keep on increasing even when the added feature doesn't contribute to predicting the output variable.

Tadaa, we have the adjusted R² value, which takes care of this by penalizing the model for the number of features it uses.
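A small sketch of both metrics; the formulas are the standard ones and the numbers are made up:

```python
import numpy as np

def r2_score(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)     # sum of squared residuals
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1 - ss_res / ss_tot

def adjusted_r2(r2, n, p):
    # Penalize R^2 for the number of features p, given n data points
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

y = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.8, 5.1, 7.2, 8.9])

r2 = r2_score(y, y_hat)
print(r2, adjusted_r2(r2, n=len(y), p=1))
```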

Wait! When can you use linear regression?

Sure, there is a set of conditions that needs to be satisfied. But that would be another post!

Things to Sleep on

  1. What is the importance of bias in linear regression?
  2. When will we not be able to get a solution?
  3. Will the solution obtained be optimal all the time?

Resources (for the stuff you probably felt going over your head)
