# What actually is least squares error in linear regression?

## TL;DR: minimizing error by maximizing log likelihood

Sep 15, 2018

Least squares error is used as a cost function in linear regression. But why should one choose squared error instead of absolute error or some other choice? There’s a simple proof showing that least squares error is a reasonable and natural choice.

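Assume the target variable $y_i$ and the inputs $x_i$ are related as below, where $\theta$ is the parameter vector and $\epsilon_i$ is the error term for the $i$-th of $n$ samples:

$$y_i = \theta^T x_i + \epsilon_i$$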

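We would like to minimize the error by maximizing the log likelihood. Assuming the errors $\epsilon_i$ are independent and Gaussian with variance $\sigma^2$ (taking the mean to be zero, since a constant offset can be absorbed into the intercept), the likelihood function of the parameters $\theta$ given $n$ samples $(x_i, y_i)$ is:

$$L(\theta) = \prod_{i=1}^{n} p(y_i \mid x_i; \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right)$$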
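To make the likelihood concrete, here is a minimal sketch that evaluates it numerically. It works with the log of the product rather than the product itself, since multiplying many small densities underflows quickly. The function name `gaussian_log_likelihood`, the synthetic data, and the "true" parameters are all assumptions made up for illustration; they are not from the original derivation.

```python
import numpy as np

def gaussian_log_likelihood(theta, X, y, sigma):
    """Log of the product of N(y_i | theta^T x_i, sigma^2) densities."""
    residuals = y - X @ theta
    n = y.shape[0]
    return (-n * np.log(np.sqrt(2.0 * np.pi) * sigma)
            - np.sum(residuals ** 2) / (2.0 * sigma ** 2))

# Synthetic data (assumed for illustration): y = 1 + 2*x + Gaussian noise
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])  # first column acts as the intercept
true_theta = np.array([1.0, 2.0])
sigma = 0.5
y = X @ true_theta + rng.normal(0.0, sigma, size=n)

# Parameters close to the data-generating ones should be far more likely
ll_true = gaussian_log_likelihood(true_theta, X, y, sigma)
ll_wrong = gaussian_log_likelihood(np.zeros(2), X, y, sigma)
print("log likelihood at true theta:", ll_true)
print("log likelihood at theta = 0: ", ll_wrong)
```

The only part of the result that depends on `theta` is the sum of squared residuals, which is what the rest of the argument relies on.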

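Taking the log gives the log likelihood of the parameters $\theta$ for $n$ samples $(x_i, y_i)$:

$$\ell(\theta) = \log L(\theta) = n \log \frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \theta^T x_i)^2$$

The first term does not depend on $\theta$, so maximizing the log likelihood function is the same as minimizing

$$\frac{1}{2} \sum_{i=1}^{n} (y_i - \theta^T x_i)^2$$

This is **also known as the least squares function**. Note that σ² is irrelevant in this case: it only shifts and rescales the objective, so it does not change which $\theta$ is optimal.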
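A quick numerical check of this equivalence, as a sketch under assumed synthetic data: the negative log likelihood is quadratic in the parameters, so a single Newton step from any starting point lands exactly on its minimum. The helper names `nll_gradient` and `nll_hessian` are hypothetical, and the result is compared against NumPy's `np.linalg.lstsq` least squares solver for several values of σ.

```python
import numpy as np

def nll_gradient(theta, X, y, sigma):
    """Gradient of the Gaussian negative log likelihood w.r.t. theta."""
    return -X.T @ (y - X @ theta) / sigma ** 2

def nll_hessian(X, sigma):
    """Hessian of the Gaussian negative log likelihood (constant in theta)."""
    return X.T @ X / sigma ** 2

# Synthetic data (assumed for illustration): y = 1 + 2*x + Gaussian noise
rng = np.random.default_rng(42)
n = 500
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])  # first column acts as the intercept
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, 0.5, size=n)

# Ordinary least squares solution
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Maximum likelihood estimate via one Newton step, for several assumed sigmas
theta_start = np.zeros(2)
mle_by_sigma = {}
for sigma in (0.1, 1.0, 10.0):
    step = np.linalg.solve(nll_hessian(X, sigma),
                           nll_gradient(theta_start, X, y, sigma))
    mle_by_sigma[sigma] = theta_start - step
    print(f"sigma={sigma}: MLE={mle_by_sigma[sigma]}, least squares={theta_ls}")
```

The factor 1/σ² appears in both the gradient and the Hessian and cancels in the Newton step, which is the numerical counterpart of σ² dropping out of the minimization.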

Note that the least-squares method corresponds to maximum likelihood estimation. Hence, one can justify the least-squares method with the natural assumption of Gaussian noise, ϵ ∼ N(μ,σ²), where a nonzero mean μ is simply absorbed into the intercept term.