“All models are wrong, but some are useful”
This quote is attributed to George E. P. Box, and three days ago marked the three-year anniversary of his passing. He was a British mathematician and Professor of Statistics…
It’s a brilliant insight that speaks to where we are with machine learning and algorithm development. I am by no means the sort of expert that he was, or that many of the leading thinkers in the field are, but I do find the idea fascinating and highly valuable.
All Models Are Wrong
Essentially, this comes down to the fact that any model, no matter how accurate, is still a simplification or generalization of something. So far, at least, it has not been possible to build one that is absolutely perfect. Maybe one day in the near future we will, and that would be amazing, but even with all the computing power and data available today, a perfect model is not yet possible.
So if they are all wrong, why do we make them? Well, it turns out there is a huge amount of data out there, and we humans can’t process and remember all of it with our brains alone, so computers and models become quite helpful.
Even if they are all wrong, models are far superior to our unaided efforts to process and understand all that data; we wouldn’t be able to make sense of it on our own.
Directly from Professor Box’s 1978 paper:
“Now it would be very remarkable if any system existing in the real world could be exactly represented by any simple model. However, cunningly chosen parsimonious models often do provide remarkably useful approximations. For example, the law PV = RT relating pressure P, volume V and temperature T of an ‘ideal’ gas via a constant R is not exactly true for any real gas, but it frequently provides a useful approximation and furthermore its structure is informative since it springs from a physical view of the behavior of gas molecules.
“For such a model there is no need to ask the question ‘Is the model true?’. If ‘truth’ is to be the ‘whole truth’ the answer must be ‘No’. The only question of interest is ‘Is the model illuminating and useful?’.”
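To make Box's gas example a little more concrete, here is a rough sketch that compares the ideal gas law with the van der Waals equation for one mole of carbon dioxide. The van der Waals equation and the constants `A_CO2` and `B_CO2` are my own additions (approximate textbook values, not something from Box's paper), and van der Waals is of course just another model that is also wrong, only less so for dense gases. The function names are made up for illustration.

```python
# Compare two models of gas pressure for one mole of gas.
# Units: volume in m^3/mol, temperature in K, pressure in Pa.

R = 8.314462618  # molar gas constant, J/(mol*K)

# Approximate textbook van der Waals constants for CO2 (assumed values).
A_CO2 = 0.3640    # Pa*m^6/mol^2, accounts for attraction between molecules
B_CO2 = 4.267e-5  # m^3/mol, accounts for the volume of the molecules


def ideal_pressure(v_molar, temperature):
    """Box's example: PV = RT, solved for P."""
    return R * temperature / v_molar


def van_der_waals_pressure(v_molar, temperature, a=A_CO2, b=B_CO2):
    """A slightly less wrong model: (P + a/V^2)(V - b) = RT, solved for P."""
    return R * temperature / (v_molar - b) - a / v_molar**2


def relative_gap(v_molar, temperature):
    """How far the ideal gas law sits from van der Waals, as a fraction."""
    p_ideal = ideal_pressure(v_molar, temperature)
    p_vdw = van_der_waals_pressure(v_molar, temperature)
    return (p_ideal - p_vdw) / p_vdw


if __name__ == "__main__":
    # Roughly room conditions, then the same gas squeezed into a small volume.
    for v_molar in (0.0248, 0.0005):
        gap = relative_gap(v_molar, 300.0)
        print(f"V = {v_molar} m^3/mol: ideal gas law differs by {gap:.1%}")
```

Near room conditions the two models agree to well under one percent, so the simpler one is perfectly useful. Squeeze the gas into a much smaller volume and the gap grows large, which is where the ideal gas law stops being useful under these assumptions.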
Some Are Useful
Finding the ones that are useful is the trick, and it’s well worth the effort. If a model or algorithm explains just a bit more than you understood before, that is knowledge you never would have had without it. So although every model is an approximation, the close ones and the good ones lead to faster growth and development.
Box returned to this idea more than once:
“Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.”
I’m sure many scientists and engineers don’t agree, but I like the truth and reality behind such a simple thought. You may as well develop a model to represent what you are seeing, see whether it predicts what happens next, and iterate on it until the “wrongness” outweighs the “usefulness”.
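One way to picture "how wrong is too wrong" is to pick a tolerance and check a model against it. The sketch below is only an illustration: the `observe` function stands in for some real process I invented (a line with a slight curve), `TOLERANCE` is an arbitrary threshold, and the straight-line fit is ordinary least squares written out by hand.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def max_abs_error(slope, intercept, xs, ys):
    """Largest gap between the line's prediction and the observed value."""
    return max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))


def observe(x):
    """Hypothetical 'real' process: mostly linear, with a slight curve."""
    return x + 0.05 * x * x


TOLERANCE = 0.1  # assumed threshold for "still useful"

# Fit the simple model on a narrow range of observations.
train_x = [0.0, 0.5, 1.0, 1.5, 2.0]
train_y = [observe(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)

# Check it where it was fitted, then far outside that range.
near_error = max_abs_error(slope, intercept, train_x, train_y)
far_x = [8.0, 9.0, 10.0]
far_error = max_abs_error(slope, intercept, far_x, [observe(x) for x in far_x])

print("useful near the data:", near_error <= TOLERANCE)
print("useful far from the data:", far_error <= TOLERANCE)
```

The straight line is wrong everywhere, but within the range it was fitted on the error stays inside the tolerance. Far from that range the error blows past it, and that is the point to go back and iterate on the model.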
My goal is to develop simple models that allow quicker analysis of key data. They may be mere approximations at first, but tuning them can yield faster progress for us and, hopefully, for others.
Build some models. Who knows, you may even invent something that could change the world.