Uncertainty

Published in CodeParticles, Feb 16, 2018

The further we get with computing (algorithms, perception, data collection), the harder the problems we try to solve and the less certainty we can have about their solutions. Logic is the study of truth, and probability is the study of what we don’t know for certain. We try to model systems so complex that we can’t know with 100% certainty what is going to happen. The best we can do is treat such a system as a black box: measure the inputs and outputs, and get as close as we can to predicting the outputs from the inputs.

Weather is a classic example. Meteorologists take data about similar past conditions and build a predictive scenario: on 95% of the days when the weather looked like it does today, it rained the next day (this will yield a screaming weatherman on TV and thousands of people buying umbrellas). But we don’t have enough data (weather conditions over the years) or enough knowledge of how to model the system to be any more accurate than that. We are getting better, though, thanks to better systems of measurement (Doppler radar, etc.) and more data being collected.

Modeling

We have a few certainties in the universe, or what we might call ‘Laws.’ Experiment and observation have caused them to be considered unassailable — when x happens, y will follow. Gravity is one, and there aren’t many more. Aside from these laws of the universe, most of our world is completely uncertain. This is why we can send a man to the moon, but we can’t figure out how our wife/husband is going to react when we buy them a present. The laws of nature are relatively simple and fixed. They’re predictable. The brain is much harder to model, as are many other systems which we call complex systems:

Complex systems are systems whose behavior is intrinsically difficult to model due to the dependencies, relationships, or interactions between their parts or between a given system and its environment.

Human behavior, weather, the stock market, and who is going to win the Super Bowl are all like this. Societies (what will happen if I kill Franz Ferdinand), cities, traffic, the economy and quantum physics are all like this. None of these things is certain. They don’t follow rules (or at least rules we can understand).

When the world’s richest tycoons want to make another billion dollars, they can easily game the system in order to do so. In contrast, if they felt inclined to reduce global inequality or stop global warming, even they wouldn’t be able to do it, because the system is far too complex. — Yuval Noah Harari, Homo Deus: A Brief History of Tomorrow, 2015

Intelligence is easier to model than creativity. Happiness is hard to measure, as is the economy. In The Growth Delusion, author David Pilling discusses our use of GDP as a model for the economy:

Invented in the U.S. in the 1930’s, the figure is a child of the manufacturing age — good at measuring physical production but not the services that dominate the modern economies. — David Pilling, The Growth Delusion, 2018

He suggests that numbers do not reflect ‘people’s lived experience’ and that relying on averages and aggregates hides the nuances of inequality. GDP is a flawed measure.

Others suggest using a happiness index to gauge the success of a country as opposed to GDP. The United Nations is taking on this task with its World Happiness Report. The United States ranks 14th in their study. But we are 1st in GDP.

Our bodies are complex, too. Consider medicine and health, where we currently have to rely on bad data. When you go to the doctor, they often ask how you feel and where it hurts, but none of that is quantifiable. With better sensors we will be able to know where it hurts. Doctors can ask how your new medicine is working on your depression, but with brain scans we’d be able to actually know. This is why data is important.

Data

Computers were basically invented to collect data and make models with the purpose of making predictions. Initially, these were models of the physical, known world — navigation, astronomy, demographics and the census.

A model is a guess at reality. We aren’t certain about something, but here is how we think it’s going to go based on what we’ve seen so far. We have to model using data. The understanding that “when x happens, y will follow” is based on lots and lots of observations. Batting average, SAT scores, GDP, the Gini index, quarterback rating and credit scores are all guesses at what’s going to happen in a certain situation.

This is why data is so important — the more data and observations we have (law of large numbers), the more we can trust our models. We can look for correlations to understand what is really going on. If we had complete weather data for the past 100 years using today’s technology, weathermen would be a lot less hated.
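The law of large numbers is easy to watch in action. Here is a minimal sketch, assuming a simulated fair coin (the function name and the fixed seed are just illustrative choices): the fraction of heads wanders early on and settles toward 0.5 as the number of flips grows.

```python
import random


def running_heads_fraction(num_flips, seed=0):
    """Flip a simulated fair coin num_flips times.

    Returns a list whose i-th entry is the fraction of heads
    seen after i + 1 flips.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    fractions = []
    for flip in range(1, num_flips + 1):
        if rng.random() < 0.5:  # treat this as "heads"
            heads += 1
        fractions.append(heads / flip)
    return fractions


fractions = running_heads_fraction(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7} flips: fraction of heads = {fractions[n - 1]:.4f}")
```

With only a handful of observations the estimate can be far off; with many, it is hard for it to stray.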

But there are many systems that are just too unpredictable, no matter how many data points we gather: quantum theory and stochastic processes in physics, for example. We still have no idea where a particle dropped in a glass of water is going to go (Brownian motion). It’s like trying to predict what route a dove is going to take when you suddenly release it from your hands.

Probability

We use probability to try to get to the truth. Consider the root of the word:

The word probability derives from the Latin probabilitas, which can also mean “probity”, a measure of the authority of a witness in a legal case in Europe, and often correlated with the witness’s nobility.

‘A measure of the authority of the witness’ — this speaks to the nature of probability as a way to use reputation to make good decisions.

After we created more instruments of measurement (beginning in the 1600s), we suddenly had more data. Insurance companies had more data on when people were likely to die and could price policies accordingly. Casinos could use reams of mathematical data to understand that in large numbers, they would always come out on top.
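To make the casino point concrete, here is a rough sketch using a game the article doesn't name: a $1 bet on red at an American roulette wheel (18 red pockets out of 38, paying even money). The game and the function names are my assumptions for illustration. The exact expected value is slightly negative for the player, and a long simulated run shows that small edge adding up.

```python
from fractions import Fraction
import random

RED_POCKETS = 18
TOTAL_POCKETS = 38  # American wheel: 18 red, 18 black, 2 green


def expected_value_per_dollar():
    """Exact expected profit for the player on a $1 even-money bet on red."""
    p_win = Fraction(RED_POCKETS, TOTAL_POCKETS)
    return p_win * 1 + (1 - p_win) * (-1)


def simulate_player_net(num_bets, seed=0):
    """Player's total profit (negative means a loss) after num_bets $1 bets on red."""
    rng = random.Random(seed)
    net = 0
    for _ in range(num_bets):
        pocket = rng.randrange(TOTAL_POCKETS)  # pockets 0..17 stand in for red
        net += 1 if pocket < RED_POCKETS else -1
    return net


print("expected value per $1 bet:", expected_value_per_dollar())
print("player's net after 200,000 bets:", simulate_player_net(200_000))
```

Any single bet is close to a coin flip, which is why individual players do win. Over hundreds of thousands of bets the house's small advantage dominates.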

(Figure: note the win probabilities on the left, to the right of the cards.)

Black Swan

Modeling can be dangerous and wrong, depending on the data being used. In addition, things that aren’t probable DO happen. If an unlimited number of people each flipped a coin 100 times, someone would eventually get 100 heads in a row. Unlikely things happen.
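How unlikely, and how many people would it take? A small sketch, assuming fair and independent coin flips (the helper names are mine): the chance of one person getting all heads is 0.5 raised to the number of flips, and the chance that at least one person in a crowd manages it is 1 - (1 - p)^people.

```python
import math


def p_all_heads(flips):
    """Probability that one person gets heads on every one of `flips` fair flips."""
    return 0.5 ** flips


def p_at_least_one(people, flips):
    """Probability that at least one of `people` independent flippers gets all heads."""
    p = p_all_heads(flips)
    if p >= 1.0:  # zero flips: everyone trivially "succeeds"
        return 1.0
    # Same as 1 - (1 - p) ** people, but written so tiny p doesn't round away.
    return -math.expm1(people * math.log1p(-p))


print("one person, 100 heads in a row:", p_all_heads(100))
print("anyone among 8 billion people: ", p_at_least_one(8_000_000_000, 100))
print("anyone among 1,024 people getting 10 in a row:", p_at_least_one(1024, 10))
```

A streak of 100 stays out of reach even for a crowd the size of the world's population, while a streak of 10 becomes more likely than not in a crowd of about a thousand. Rare events become ordinary once enough people are trying.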

In addition, people are horrible at using statistics in their everyday lives. We see patterns where none exist and are fooled by statistics all the time. We give weight to things incorrectly. In Fooled by Randomness, Nassim Nicholas Taleb discusses the role of survivorship bias, where we give massive credit to winners while forgetting all the losers, even though much of the winners’ success relies on luck. Think of a coin-flipping contest: after each flip there are fewer and fewer survivors. At the end, we interview the last person standing and ask how they did it.
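The coin-flipping contest can be simulated in a few lines. This is only a sketch under my own assumed rules: every remaining player flips a fair coin each round, tails are eliminated, and the contest stops once at most one player is left. Nobody in the simulation has any skill, yet someone usually ends up with a long winning streak.

```python
import random


def coin_flip_contest(num_players, seed=0):
    """Run an elimination contest where only players who flip heads survive.

    Returns the number of survivors after each round, starting with the
    initial field. The last entry is 1 (a lone "winner") or 0 (the final
    players all flipped tails in the same round).
    """
    rng = random.Random(seed)
    survivors = num_players
    history = [survivors]
    while survivors > 1:
        # Each remaining player flips once; heads stays in.
        survivors = sum(rng.random() < 0.5 for _ in range(survivors))
        history.append(survivors)
    return history


history = coin_flip_contest(1024)
print("survivors per round:", history)
print("rounds played:", len(history) - 1)
```

Starting with 1,024 players, the field roughly halves each round, so a winner with around ten straight heads is the expected outcome, not evidence of talent.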

This is why there are rules around mutual fund advertising, such as Rule 482: touting past performance is like bragging that you got 10 heads in a row. There’s a chance Warren Buffett is just extremely lucky; he may be the guy who has flipped 100 heads in a row. Many people made a lot of money off the 2008 financial crash and were suddenly handed tons of money to manage because they were considered a ‘genius’. In many cases it was luck, and their funds have often been unsuccessful since.

The 2008 financial crash is another good example of probability gone wrong. The models in use were based on historical data from a period of incredible economic growth, and the world in 2005 just wasn’t the same, so the models didn’t apply well. The credit scores that lenders used to judge the people to whom they lent money to buy homes turned out to be faulty predictors of those borrowers’ actual ability to repay.

And then there are Black Swans, as Nassim Nicholas Taleb calls them: events that are extremely unpredictable and wildly world-changing. 9/11 is a good example. No one in the stock market had this event built into their model, but it changed the world. Taleb argues that in hindsight we try to rationalize why these events happened so that we can comfortably continue to use models that don’t account for them.

Algorithms

So what the hell does this have to do with technology? A great many of the algorithms that are going to drive our future are based on probability.

As Pedro Domingos states in The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World:

These seemingly magical technologies work because, at its core, machine learning is about prediction: predicting what we want, the results of our actions, how to achieve our goals, how the world will change. — Pedro Domingos, The Master Algorithm, 2015

When Watson competed on Jeopardy!, the broadcast actually showed the computer saying, in effect, ‘based on what I believe, there is a 98% chance this is the correct answer.’

Netflix has its ‘match score’, baseball predictions have ‘win probabilities’ based on simulations, and the algorithms that play chess, Go and many other games compute win probabilities for each side after each move. Self-driving cars, face, image and voice recognition, and many other algorithms behind artificial intelligence rely on a probability threshold: once it is hit, a decision can be made.
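The threshold idea fits in a few lines. This is a toy sketch, not how any particular product works: the labels, the 0.9 cutoff and the function name are all made up for illustration. The model hands back a probability for each candidate answer, and the program only commits when the best one is confident enough.

```python
def decide(probabilities, threshold=0.9):
    """Return the most likely label if its probability clears the threshold.

    `probabilities` maps each candidate label to the model's estimated
    probability. Returns None when no label is confident enough, which
    stands for "don't act yet" (gather more data, ask a human, etc.).
    """
    label, p = max(probabilities.items(), key=lambda item: item[1])
    return label if p >= threshold else None


# Hypothetical outputs from an image recognizer.
print(decide({"pedestrian": 0.97, "mailbox": 0.03}))  # confident enough to act
print(decide({"pedestrian": 0.55, "mailbox": 0.45}))  # too uncertain: None
```

Where to set the threshold is a judgment call: too low and the system acts on bad guesses, too high and it never acts at all.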