Conquer These 5 Statistical Mistakes and Make Better Decisions
Recognizing Our Biases Has Never Been More Important
“If it disagrees with experiment, it’s wrong. In that simple statement is the key to science,” said Richard Feynman, concisely capturing the scientific method and our best-known process for decision-making.
Unfortunately, this isn’t the view of science that most people hold. For too many, science means rote memorization and a series of overly complicated equations, neither of which inspires much excitement for the subject.
But science is much more than understanding the mechanics. It’s a process — a way of learning. And one that we all do every day. In the words of Carl Sagan,
“Science is a way of thinking much more than it is a body of knowledge.”
Science is assessing new information as it relates to our existing beliefs. Then developing new (and hopefully improved) beliefs based on this information. All the while becoming more or less certain of our previous positions.
It’s a process that helps anyone who’s looking to make more informed decisions. And it’s a process that could be in for some significant changes.
A Millennia-Old Decision Model
“Now I’m going to discuss how we would look for a new law. In general, we look for a new law by the following process. First, we guess it, no, don’t laugh, that’s the truth. Then we compute the consequences of the guess, to see what, if this is right, if this law we guess is right, to see what it would imply and then we compare the computation results to nature or we say compare to experiment or experience, compare it directly with observations to see if it works.” — Richard Feynman
What if the scientific method — a process that has been a cornerstone of discovery and development since the days of Aristotle — was going to significantly change in the next few years?
What if — instead of starting with a guess as Feynman describes — we use the overwhelming amount of available data to frame that initial hypothesis? In today’s world of nearly unlimited data accessibility, are we on a path to eliminate that first guess?
How long before we don’t even need to pontificate on the uncertainties any longer? How soon will these massive troves of data merely point us in the next direction?
It seems like an incredible efficiency. Why waste time guessing if we can just figure it out initially?
But with this path comes even greater reliance on the initial data. And with that, an even greater need to make sure we’re using it responsibly.
Unfortunately, this isn’t a strong suit for the majority of us.
The Double-Edged Sword of Stats
President Dwight Eisenhower once expressed astonishment and alarm upon hearing that fully half of all Americans have below-average intelligence. He quickly realized his mistake and chuckled over it, but gaffe aside, it can be easy to misread the nature of statistics.
Recent years haven’t improved this situation. We’re inundated with data claims on a daily basis. Newscasters seem intent on throwing numbers at us until we just tune them out.
So it’s not surprising that we create mental shortcuts to limit our own cognitive burden. Our minds develop biases and heuristics to conserve our precious mental energy. And there are few bigger drains on it than evaluating data and statistics.
Yet statistics can be interesting. I recognize your skepticism, but donate 4 minutes to watching Hans Rosling’s brilliant distillation of 200 countries over 200 years and it’s difficult not to get caught up in his excitement. Or watch any of his TED talks, which feel more like a sporting event than a statistics discussion. In Hans’s words,
“There’s nothing boring about statistics. Especially not today, when we can make the data sing. With statistics, we can really make sense of the world. With statistics, the “data deluge,” as it’s been called, is leading us to an ever-greater understanding of life on Earth and the universe beyond.”
As our decision-making process continues to evolve, and data and statistics play larger roles in not just assessing our hypotheses, but framing them as well, we have more responsibility than ever to use these tools effectively.
We can no longer tune these numbers out. And that starts with confronting these shortcuts and accounting for them. In the wise words of Daniel Kahneman,
“The best we can do is compromise: learn to recognize situations in which mistakes are likely and try harder to avoid significant mistakes when the stakes are high.”
And while this is by no means an exhaustive list, in my experience a handful of common biases and heuristics repeatedly influence our views of data and evidence and limit the effectiveness of our decisions. For each one, the solution is largely awareness: conditioning ourselves to recognize these tendencies and mentally guard against their influence.
Anchoring
One of the main reasons that we need to pay attention to the accuracy of statistics is our tendency to anchor to irrelevant information. Once we establish our initial position (or anchor), we rarely adjust sufficiently from that position in the presence of new information.
We see anchors in everything from first impressions to asking prices. And once we associate ourselves with an initial anchor, we struggle to develop arguments to move away from it, defaulting us back to our initial position.
As if this wasn’t bad enough, anchors have also been shown to have an impact even when they’re completely unrelated to the topic.
Amos Tversky and Daniel Kahneman ran an experiment in which they spun a wheel in front of a group of students. The wheel was marked 0 to 100 but rigged to stop only at 10 or 65. They asked the students to write down the number on which the wheel stopped, then asked them two questions:
- Is the percentage of African nations among UN members larger or smaller than the number you just wrote?
- What is your best guess of the percentage of African nations in the UN?
While no one would say that the number on the wheel would yield relevant information about the percentage of African nations, people were still affected by it. The average estimates of students who wrote down 10 and 65 were 25% and 45% respectively.
In another study, participants were asked whether Gandhi died before or after age 9, or before or after age 140. Clearly neither value is plausible, and neither should have an impact on an estimate of when Gandhi died. But the two groups’ estimates differed significantly (an average age of 50 versus 67).
While there are multiple theories and explanations of why our minds anchor to initial values, the effect is still the same. Whether we realize it or not, we’re influenced by starting values. In response, we should assume that any starting value will have some level of anchoring effect on us. And we need to consciously take steps to dissociate our mental position from this initial anchor.
Insensitivity to Sample Size
“They say 1 out of every 5 people is Chinese. How is this possible? I know hundreds of people and none of them is Chinese,” Carl Sagan joked in The Demon-Haunted World, giving a tongue-in-cheek example of how easily we stray from reality when we neglect sample size.
If I flipped a coin four times and got heads on three of them, you probably wouldn’t think much of it. But if I flipped it four million times and got heads on three million of them, you’d probably start looking more closely at that coin.
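To put rough numbers on that intuition, here is a minimal Python sketch. It assumes a fair coin and independent flips, and the helper names are my own, purely for illustration. It computes how likely three or more heads in four flips really is, and how many standard deviations each result sits from an even split.

```python
from math import comb, sqrt


def prob_at_least(k, n, p=0.5):
    """Exact probability of k or more heads in n flips when P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def z_score(heads, n, p=0.5):
    """Distance of the observed head count from n * p, in standard deviations."""
    return (heads - n * p) / sqrt(n * p * (1 - p))


# Three or more heads in four flips of a fair coin happens almost a third of the time.
small = prob_at_least(3, 4)

# The same 75% share of heads means very different things at different sample sizes.
z_small = z_score(3, 4)
z_large = z_score(3_000_000, 4_000_000)

print(f"P(at least 3 heads in 4 flips) = {small}")
print(f"4 flips: {z_small} standard deviations from an even split")
print(f"4 million flips: {z_large} standard deviations from an even split")
```

One standard deviation is unremarkable. A thousand of them is a coin worth examining.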
We quickly recognize the difference between a sample size of four and four million, but we’re less successful at differentiating between less extreme variations. Or worse, we neglect to even inquire about the sample size before running with the conclusions. Kahneman attributes this to our tendency to fixate on individual stories,
“The exaggerated faith in small samples is only one example of a more general illusion — we pay more attention to the content of messages than to information about their reliability, and as a result end up with a view of the world around us that is simpler and more coherent than the data justify. Jumping to conclusions is a safer sport in the world of our imagination than it is in reality.”
Unless we immediately reject the message, we get caught up in the story instead of focusing on the accuracy of the data. We frequently see this in engineering, where justifications rely on past precedent over analytical assurance. As Steven Vick points out in Degrees of Belief: Subjective Probability and Engineering Judgment,
“If something has worked before, the presumption is that it will work again without fail. That is, the probability of future success conditional on past success is taken as 1.0. Accordingly, a structure that has survived an earthquake would be assumed capable of surviving with the same magnitude and distance, with the underlying presumption being that the operative causal factors must be the same. But the seismic ground motions are quite variable in their frequency content, attenuation characteristics, and many other factors, so that a precedent for a single earthquake represents a very small sample size.”
Years ago, a report came out saying 41% of Muslims in the US supported jihad. There were a host of issues with this study, including the fact that it was an online opt-in survey with no real guarantee that anyone taking it was actually Muslim, and the fact that the majority of respondents defined jihad as a “Muslim’s personal, peaceful struggle to be more religious.” But another major flaw was that the respondents totaled just 600 self-selected people. In a country of over 3.3 million Muslims, a sample that small can only be representative if it’s drawn at random, and this one wasn’t.
But none of these flawed methods stopped multiple news outlets from parroting misleading clickbait headlines.
In these situations, awareness is our best weapon. Ask questions. Be skeptical. And make sure you’re checking the validity of the sample before becoming swept up in the story.
Denominator Neglect
Which childhood vaccine seems more dangerous, one that carries a 0.001% chance of permanent disability or one in which 1 of every 100,000 children will become permanently disabled?
Most people will instinctively choose the latter, even though a few moments of consideration show that the two risks are identical. As Kahneman explained,
“The second statement does something to your mind that the first does not: it calls up the image of an individual child who is permanently disabled by a vaccine; the 99,999 safely vaccinated children have faded into the background.”
Paul Slovic termed this phenomenon denominator neglect, showing that people weight low-probability events more heavily when they’re described as relative frequencies rather than as probabilities or likelihoods.
In another example, people saw information about a disease that kills 1,286 people out of every 10,000 and another that kills 24.14% of the population. People judged the first disease to be more threatening, even though its risk is only about half that of the second.
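One way to guard against this is to put every figure on the same scale before comparing. The sketch below is just an illustration using the vaccine and disease numbers above. The helper names are hypothetical, and exact fractions are used so that rounding doesn’t muddy the comparison.

```python
from fractions import Fraction


def from_frequency(cases, out_of):
    """Turn 'cases out of out_of' into an exact probability."""
    return Fraction(cases, out_of)


def from_percent(text):
    """Turn a percentage string such as '24.14%' into an exact probability."""
    return Fraction(text.rstrip("%")) / 100


# The two vaccine descriptions are the same risk in different clothing.
vaccine_pct = from_percent("0.001%")
vaccine_freq = from_frequency(1, 100_000)
print(vaccine_pct == vaccine_freq)

# The two diseases, once both are expressed as probabilities.
disease_a = from_frequency(1_286, 10_000)
disease_b = from_percent("24.14%")
print(f"Disease A: {float(disease_a):.2%}, Disease B: {float(disease_b):.2%}")
print(f"A is about {float(disease_a / disease_b):.0%} as deadly as B")
```

Once both risks share a denominator, the scarier-sounding disease turns out to be the milder one.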
Savvy practitioners will exploit this tendency when presenting statistics. Someone wishing to incite fear may report that “approximately 1,000 homicides a year are committed nationwide by seriously mentally ill individuals who are not taking their medication.” While another way of reporting the data would be to say “the annual likelihood of being killed by such an individual is approximately 0.00036%.”
Let your analytical mind do its work. Ask yourself whether you’re seeing the full picture. Often the difference between getting swept up in an emotional story and recognizing the best analytical choice is a few moments of dispassionate consideration.
Misconceptions of Chance
“Chance is commonly viewed as a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium. In fact, deviations are not ‘corrected’ as a chance process unfolds, they are merely diluted.” — Amos Tversky and Daniel Kahneman, Judgment Under Uncertainty: Heuristics and Biases
How often have you seen people load up their roulette bets on black after the wheel hits red a couple of times?
Just as we mistakenly extend the influence of small sample sizes, we often expect long-run probabilities to assert themselves immediately. As Peter Bevelin writes in Seeking Wisdom,
“We tend to believe that the probability of an independent event is lowered when it has happened recently or that the probability is increased when it hasn’t happened recently.”
We take a long-run view of behavior and mistakenly believe that chance events will quickly self-correct. Except nature doesn’t abide by a sense of fairness. And as the annoying guy at the roulette table likes to remind you, “the wheel has no memory.”
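If you’d rather check the annoying guy’s claim than take his word for it, here is a small sketch that enumerates every equally likely sequence of spins. It assumes a European single-zero wheel (18 red, 18 black, 1 green pocket) and independent spins. That layout is my assumption, not something specified above.

```python
from fractions import Fraction
from itertools import product

# Assumption: European single-zero wheel with 18 red, 18 black and 1 green pocket.
POCKETS = ["red"] * 18 + ["black"] * 18 + ["green"]


def prob_black_after(streak):
    """P(next spin is black | the previous spins matched `streak`).

    Counts over every equally likely sequence of len(streak) + 1 spins.
    """
    streak = tuple(streak)
    matching = 0
    matching_then_black = 0
    for spins in product(POCKETS, repeat=len(streak) + 1):
        if spins[:-1] == streak:
            matching += 1
            if spins[-1] == "black":
                matching_then_black += 1
    return Fraction(matching_then_black, matching)


unconditional = Fraction(POCKETS.count("black"), len(POCKETS))
after_two_reds = prob_black_after(["red", "red"])

print(f"P(black) on any spin:       {unconditional}")
print(f"P(black) after red, red:    {after_two_reds}")
```

Both probabilities come out the same. The streak changes nothing about the next spin.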
We often apply a skill philosophy to chance events, tricking ourselves into believing that we can predict future events based on past ones. Shane Parrish offers a good strategy for discerning the difference, suggesting we ask ourselves whether we can lose on purpose. If not, the previous instance was likely the result of chance, and shouldn’t be given any predictive credibility.
Taken over the long run, the proportions of chance events will eventually even out. We just need to resist the tendency to expect that long-run view to show up right now.
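Tversky and Kahneman’s distinction between “corrected” and “diluted” is easy to see with a little arithmetic. The sketch below assumes a fair coin and a hypothetical starting streak of 10 heads in 10 flips. It then computes what we should expect after more flips: the surplus of heads never shrinks, but its share of the total does.

```python
from fractions import Fraction


def expected_after(heads_so_far, flips_so_far, more_flips):
    """Expected surplus of heads and expected share of heads after more fair flips."""
    total = flips_so_far + more_flips
    expected_heads = heads_so_far + Fraction(more_flips, 2)
    surplus = expected_heads - Fraction(total, 2)
    return surplus, expected_heads / total


# Hypothetical start: 10 heads in the first 10 flips, a surplus of 5 over an even split.
results = {}
for more in (0, 90, 990, 99_990):
    surplus, share = expected_after(10, 10, more)
    results[more] = (surplus, share)
    print(f"after {more:>6} more flips: surplus = {surplus}, share of heads = {float(share):.5f}")
```

The expected surplus stays at 5 heads no matter how long we keep flipping. Future flips don’t owe us any tails; they just make the early streak matter less.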
Insensitivity to Base Rates
Similar to anchoring and our availability bias, an insensitivity to base rates causes us to favor available information over more relevant data in making decisions. In one example, Daniel Kahneman and Amos Tversky invented a fictitious woman named Linda and gave her the following description:
“Linda is thirty-one years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in antinuclear demonstrations.”
Participants were then asked which statement was more likely. Which would you choose?
- Linda is a bank teller.
- Linda is a bank teller who is active in the feminist movement.
Did you choose the latter option? It’s easy to see someone like Linda being involved in feminist causes. And it’s much easier to picture her in that role than as a bank teller. Undergraduate study participants agreed. Nearly 90% said that Linda was more likely to be a feminist bank teller than a bank teller.
Except this choice completely defies the laws of probability. Since all feminist bank tellers are included within the overall base of bank tellers, the probability of Linda being a feminist bank teller cannot be higher than the probability of her being a bank teller.
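The same point in a few lines of Python. The numbers below are hypothetical, chosen only to make the story as favorable to “feminist” as possible. However we set them, the joint probability can never exceed the probability of being a bank teller.

```python
def joint_probability(p_teller, p_feminist_given_teller):
    """P(teller and feminist) = P(teller) * P(feminist | teller)."""
    for p in (p_teller, p_feminist_given_teller):
        if not 0 <= p <= 1:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_teller * p_feminist_given_teller


# Hypothetical inputs: a 2% chance Linda is a bank teller, and a 95% chance
# that she is active in the feminist movement if she is one.
p_teller = 0.02
p_both = joint_probability(p_teller, 0.95)
print(f"P(bank teller) = {p_teller}, P(feminist bank teller) = {p_both:.3f}")

# Sweep the whole range to confirm the inequality never flips.
grid = [i / 20 for i in range(21)]
never_exceeds = all(joint_probability(a, b) <= a for a in grid for b in grid)
print(never_exceeds)
```

Adding a detail can make a story more vivid, but it can only make the event less likely or, at best, equally likely.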
It’s easy to see when we think through it, but much more difficult to recognize when our minds grab onto the story and neglect the base rate. Instead, take the time to establish the appropriate reference base rates up front. Then, use the available information to alter the likelihoods from those starting points, instead of jumping to the finish line based on the story alone.
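Starting from a base rate and then adjusting is exactly what Bayes’ rule does. Here is a minimal sketch with made-up numbers: a hypothesis that is true for 1% of people, and a description that is five times more likely to fit someone for whom it’s true. Every value here is a hypothetical chosen for illustration.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability of the hypothesis after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


# Hypothetical inputs, not taken from any study:
base_rate = 0.01   # share of people for whom the hypothesis is true
fits_if_true = 0.5   # chance the description fits when it is true
fits_if_false = 0.1  # chance the description fits when it is not

posterior = update(base_rate, fits_if_true, fits_if_false)
print(f"Before the story: {base_rate:.1%}, after the story: {posterior:.1%}")
```

With these inputs, a description that fits five times better moves us from 1% to just under 5%. The story matters, but it has to work against the base rate, not replace it.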
Be Skeptical. Be Analytical. Be Informed.
“Being a scientist requires having faith in uncertainty, finding pleasure in mystery, and learning to cultivate doubt. There is no surer way to screw up an experiment than to be certain of its outcome.” — Stuart Firestein, Ignorance: How it Drives Science
With each technological development, our capacity for more accurate predictions and more informed decisions continues to improve. And as this available data continues to influence not only our assessments, but also our initial hypotheses, the need for responsible analysis increases with it.
But while the scientific method may see some change, the intent behind it does not. We still need to make sure that we anchor our initial positions within the bounds of logic. And we still need to question the credibility of any evidence that either confirms or conflicts with this position.
None of the biases and heuristics here are overly complicated. They just require us to stop and question the evidence that is now more easily accessible than ever before.
Above all, take responsibility for the information that you’re taking in. As the father of modern anthropology, Claude Levi-Strauss, articulated over 50 years ago,
“The scientist is not a person who gives the right answers, he’s the one who asks the right questions.”