Statistics: A Branch of Mathematics or a Science?

Ben Denis Shaffer
Jan 19, 2017


One day I was sitting at a Starbucks working on some problems for an assignment. An older woman sitting nearby politely asked me if I was a student and what I was studying. I happily replied that I am a graduate student studying statistics. She replied:

Oh, statistics. So, mathematics!

Of course, I wasn’t offended in the slightest, but I still said:

Not really, but sure.

Being a graduate student of statistics makes it important for me to understand the nature of the subject to which I am devoting many hours of my life, daily. Perhaps I should have known what statistics is before pursuing a master’s degree in it, but to me it is kind of a chicken-and-egg situation. It has been a bit over two years since I decided that statistics was the discipline for me, and so maybe it’s a good moment to reflect on what I got myself into.

What I actually want to do is convince myself that statistics is not mathematics. Mainly I want to do this so that I don’t feel so bad when I make computational errors or when I search the internet for something I studied as an undergraduate student. At the same time, I can address what seems to be a very common misconception, namely that statistics is a branch of mathematics. And by the way, I find mathematics fascinating in and of itself. So really my interest is to understand and address the source of this misconception, especially because statistical thinking is coming to the forefront in more and more areas of our lives.

Take a Fitbit or an Apple Watch, for example. Do you take all those step counts and calories burned for granted, or do you critically evaluate them? What about the unemployment rate or the inflation rate? These are simple numbers that try their best to reflect a complex reality in a useful and efficient way. So, there you go, some motivation for why you should understand what statistics is, in the unlikely event that you are not a grad student of statistics like myself.

To me it seems that the real reason this misconception arises is that no one really knows what mathematics is. This is not meant to sound derogatory in any way! I’ll bring up a couple of funny quotes to illustrate the point:

Young man, in mathematics you don’t understand things. You just get used to them. – John von Neumann

A mathematician is a blind man in a dark room looking for a black cat which isn’t there. – Charles Darwin

What these quotes illustrate is that mathematics is an abstract world that we can explore only by entering it through our consciousness. Numbers, points, lines, and circles, as they are defined in mathematics, don’t exist in the physical world as we experience it.

Statistics is something different entirely. What a statistician is concerned with are data. Data are a reflection of the real world, and what statistics does is take that reflection/image of reality and look at where and how it fits the abstract mathematical world and, more importantly, try to evaluate how certain you can be about this fit.

Imagine you have a sheet of paper crumpled into a ball. You also have two imaginary shapes that you can scale to any size: a circle and a rectangle. Your task is to choose the shape that best fits the real object you have. “Best” means that there is the least open space between the surface of the paper ball and the imaginary shape. Obviously you can put the ball in a box, but you know that a circle would fit better.

Statistics is similar, except that our real objects are data and our imaginary objects are models and distributions. This is why statistics is a science just like physics. Physicists make sense of the real physical world using mathematics, and statisticians make sense of data using mathematics.
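To make the paper ball analogy a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the data are simulated rather than real, the two "imaginary shapes" are a normal and a uniform distribution, and "best" is taken to mean the higher log-likelihood. Other definitions of fit are just as legitimate.

```python
import math
import random


def normal_loglik(data):
    """Log-likelihood of the data under a normal distribution
    whose mean and variance are the maximum-likelihood estimates."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def uniform_loglik(data):
    """Log-likelihood of the data under a uniform distribution
    stretched exactly from the smallest to the largest observation."""
    n = len(data)
    width = max(data) - min(data)
    return -n * math.log(width)


# Simulated "real object": 500 made-up measurements (an assumption, not real data).
random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(500)]

# Try both imaginary shapes on the same data and keep the one that fits better.
scores = {"normal": normal_loglik(data), "uniform": uniform_loglik(data)}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: log-likelihood = {score:.1f}")
print("better fit:", best)
```

The uniform distribution plays the role of the box here: it can always be stretched to contain the data, but it leaves a lot of "open space" where few observations fall.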

Take the Apple Watch, for example. The watch shows you how many calories you burn during the day. Of course this number is not exact, because the watch can’t actually count the real number of calories; it can only estimate it. It does this using data and a model/algorithm that is programmed into the watch software. But how is the model/algorithm chosen? In the same way as you chose an imaginary circle for the real paper ball. The designers program a model that tries to give you an estimate that you can be most certain about. And if statistical/machine learning is used, the algorithm changes over time. I don’t think that is the case for the Apple Watch, but it is for something like the Amazon Echo.

In this example, statistics is the science behind how the best model/algorithm is chosen and how “best” is defined in this context.
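As a rough illustration of what defining “best” can look like, here is a small Python sketch. It is not how the Apple Watch works; I don’t know what model it uses. The step counts and calorie numbers are simulated, the 0.04 calories-per-step figure is made up, and “best” is assumed to mean the lowest mean squared error on data the model has not seen.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Simulated days of (steps, calories); all numbers are invented for illustration.
random.seed(1)
steps = [random.uniform(2000, 15000) for _ in range(200)]
calories = [0.04 * s + random.gauss(0, 30) for s in steps]

# Fit on the first 150 days, judge on the remaining 50.
train_x, test_x = steps[:150], steps[150:]
train_y, test_y = calories[:150], calories[150:]

mean_cal = sum(train_y) / len(train_y)
slope, intercept = fit_line(train_x, train_y)

# Two candidate models: ignore the steps entirely, or use a straight line.
candidates = {
    "constant": lambda s: mean_cal,
    "linear": lambda s: slope * s + intercept,
}
errors = {name: mse(f, test_x, test_y) for name, f in candidates.items()}
best_model = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: held-out MSE = {err:.1f}")
print("chosen model:", best_model)
```

Changing the definition of “best” (say, to mean absolute error, or to penalize more complicated models) can change which model wins, and that choice is itself part of the statistics.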
