The Depths of Statistics

What would happen if statistics were built around absolutes instead of squares?

Yogesh Singla
y.reflections
3 min read · Oct 12, 2023

--

If you have taken a college statistics course, you will be familiar with measures of central tendency (mean, median, mode) and measures of spread (variance, standard deviation), with R and R-squared, and with sampling, populations, proportions and distributions such as the normal and Poisson, along with Bayesian ideas. Those courses might have ended with some test statistics and a touch of machine learning algorithms. I do not intend to explain these concepts here; there are numerous courses and videos available online that do so.

In this post, I want to step back and ask a fundamental question:

Why did we build statistics around squares and not absolutes?

Motivation

A statistics course starts by explaining some robust measures such as the median and the inter-quartile range (IQR), along with the mean absolute error, but then takes the path of using squared error terms to calculate variance and standard deviation. Most of the remaining statistical tools are built on top of variance and standard deviation.

Did you wonder what happened to the median and IQR? Or the mean absolute error?

This is what led me down an internet ‘wormhole’, from one website to another and even into long discussions with ChatGPT, looking for answers until the Eureka moment! But before we jump into that, here are some unsatisfactory answers from the internet.

Bad Explanations of why we use squares…

The most common answer I found was that we use squared terms instead of absolutes because:

  1. It makes the maths easier; or
  2. It has been the historical precedent.

Just because something is difficult, or has always been done a certain way, should in no way deter us from thinking differently. I was looking to build a deeper intuition for why things are done the way they are, instead of just accepting these common answers.

My Intuition

Using squared error terms gives us a measure of spread, the variance, which is in squared units rather than the units of the data. The standard deviation brings the spread back to the units of the data, but it is not robust: squaring gives outliers a disproportionately large weight. The mean of absolute errors weights every deviation in proportion to its size, so it is a more direct measure of the typical deviation. This makes much more sense when we think geometrically: each variable gives us a new dimension to think about, and the dataset expands into that space.
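To make the point about outliers concrete, here is a small sketch. The numbers are made up for illustration, and `largest_share` is a hypothetical helper of my own rather than a standard function. It measures how much of the total error comes from the single largest deviation, once with absolute errors and once with squared errors.

```python
import statistics

def largest_share(values, power):
    """Fraction of the total |deviation| ** power that comes from the
    single largest deviation from the mean."""
    centre = statistics.fmean(values)
    terms = [abs(v - centre) ** power for v in values]
    return max(terms) / sum(terms)

# Hypothetical readings: identical except for one outlier.
clean = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 100]

for name, data in (("clean", clean), ("with outlier", with_outlier)):
    abs_share = largest_share(data, 1)   # absolute errors
    sq_share = largest_share(data, 2)    # squared errors
    print(f"{name}: absolute share = {abs_share:.0%}, squared share = {sq_share:.0%}")
```

With the outlier included, that one point accounts for half of the total absolute error but roughly four fifths of the total squared error, which is what I mean by squares weighting outliers disproportionately.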

If geometry is an approximation of the real world, we know that going from one block to another across town would not be a straight line. Yet the geometric approximation uses squares and a square root (the straight-line, or Euclidean, distance). Imagine if all distances and maths were based on Manhattan distances instead of the root of squares: things would be more ‘practical’ and directly applicable, but much more difficult to work with.
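As a rough illustration of the two ways of measuring distance, here is a minimal sketch. The coordinates are hypothetical, counted in city blocks, and the function names are my own.

```python
import math

def euclidean(p, q):
    """Straight-line distance: root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Grid distance: sum of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical trip: 3 blocks east and 4 blocks north.
home, office = (0, 0), (3, 4)

print(euclidean(home, office))  # 5.0 (as the crow flies)
print(manhattan(home, office))  # 7   (what you actually walk)
```

The standard deviation plays the role of the straight-line distance here, and the mean absolute error plays the role of the walk along the grid.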

Similarly, we can certainly build a statistical model on mean absolute error terms instead of root mean squares, with new definitions in place of variance and standard deviation that are less sensitive to outliers. This is the territory of Robust Statistics, a field that uses robust estimators (the median and the IQR, for example) instead of the mean, variance and standard deviation. It is applied in fields such as finance, for modelling stock returns (which are not normally distributed), in economic models, and in specific engineering applications where we want to work with the actual size of the errors rather than a squared approximation.
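One way to see how the choice of error term changes the "model" is to ask which single number best summarises a dataset under each kind of error. The sketch below uses made-up data and a brute-force search over candidate values. It is only an illustration, not how one would fit a real model.

```python
import statistics

def squared_loss(centre, values):
    """Total squared error if every value is summarised by `centre`."""
    return sum((v - centre) ** 2 for v in values)

def absolute_loss(centre, values):
    """Total absolute error if every value is summarised by `centre`."""
    return sum(abs(v - centre) for v in values)

data = [10, 11, 12, 13, 100]  # hypothetical readings with one outlier

# Candidate summaries from 0.0 to 100.0 in steps of 0.1.
candidates = [i / 10 for i in range(0, 1001)]

best_squared = min(candidates, key=lambda c: squared_loss(c, data))
best_absolute = min(candidates, key=lambda c: absolute_loss(c, data))

print(best_squared, statistics.fmean(data))    # squares lead to the mean
print(best_absolute, statistics.median(data))  # absolutes lead to the median
```

On this data, minimising squared error lands on the mean (pulled towards the outlier), while minimising absolute error lands on the median. So the median did not disappear from statistics; it is what you get when you start from absolutes.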

End-Note

This was just an exercise to build intuition. I understand there are still nuances, and this comparison will not hold up to mathematical rigour. But it helped me understand why we use squares more than absolutes, and why we should be aware of the assumptions and limitations of these models.
