I am on a quest to show how Statistics will make the world a better place. Before you dismiss me as a fanatic who reduces everything to a statistic, hear me out, because I’m not. Here are three concepts from statistics that I believe can decrease conflicts in the world and encourage constructive dialogue.
Concept 1: Not all problems are data problems
Let’s set aside distracting issues such as personal problems with your significant other, and let’s ignore the difficulty of data collection. There are caveats about data that people often forget: the amount of data you need cannot keep up with the complexity of the problem, some data can be worse than no data, and some things simply cannot be quantified.
The amount of data you need cannot keep up with the complexity of the problem
Let’s start with a simplified version of life where everything is quantifiable. Even then, I argue that we cannot collect enough data to address the complexity of all problems.
To visualize this, see the left-hand graph below, where we use 10 points to “cover” a line. Imagine it as my attempt to understand the factors in deciding when to have children. Now look at the right-hand graph, which tries to cover a 2D surface with twice as many points. This is similar to me now wondering where to raise my children, in addition to the original question. Hopefully you agree that the 2D surface could use more points to have the same “density” as the 1D case before. In our analogy, I would need to interview a range of people for each location to even remotely understand what the optimal choice would be.
Statisticians call this the “curse of dimensionality”: the amount of data required increases exponentially as the complexity of the problem increases. Hence, the more detailed and comprehensive your personalized recommendation system becomes, the more data you require.
Therefore, even if you could quantify and measure everything perfectly, you cannot rely on data to solve all problems.
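To make the curse of dimensionality concrete, here is a minimal Python sketch; the choice of 10 grid points per axis is my own illustrative assumption, not taken from the graphs above.

```python
# A minimal sketch of the curse of dimensionality: to keep the same
# "density" of points (here, a grid with 10 points per axis on the unit
# cube), the number of points you need grows exponentially with the
# number of dimensions, i.e. with the number of questions asked at once.

def points_needed(points_per_axis: int, dimensions: int) -> int:
    """Grid points required to cover [0, 1]^dimensions at a fixed spacing."""
    return points_per_axis ** dimensions

for d in range(1, 6):
    print(f"{d} dimension(s): {points_needed(10, d):>7,} points")
# 1 dimension(s):      10 points
# 2 dimension(s):     100 points
# ...
# 5 dimension(s): 100,000 points
```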
Some data can be worse than no data
Now let’s imagine a universe where we might not know what to measure, or where what we want to measure is truly difficult to capture. In both of these cases, we often opt for proxy data (proxies), and optimizing for proxies can lead to the opposite of the intended outcome.
For example, learning is hard to gauge, so students use grades as a proxy for their comprehension. However, students who overly optimize for grades are encouraged to take fewer risks. I have students who are afraid of adding justifications to their TRUE/FALSE assignments because doing so increases their chances of losing a point. While I can sympathize with their situation, I would argue that learning the material at the risk of failing is better in the long run than earning points at the risk of not learning.
For companies, data can give a false sense of security and blind leaders to the company’s issues. A company can chase metrics like a dog after its own tail: in the first quarter you prioritize page views over users, in the second quarter you target unique users over profit, and in the third quarter you aim for profit over engagement. By the end, different features and efforts cannibalize each other, the company has accumulated more tech and data debt, and the executive team is puzzled by the sluggish development despite all the promising metrics.
I can agree that these proxies were used incorrectly, and perhaps the alternative world of no data would be worse. But we too often forget that most metrics are proxies, and by the time the damage is apparent, it’s too late to turn back. Cathy O’Neil details many examples in her book Weapons of Math Destruction, where optimizing proxies has led to disastrous societal outcomes and not having the data might have been better in the long run.
A lot of concepts are not quantifiable
But how do you quantify happiness? What about the importance of a liberal arts education? What about company culture? A lot of problems are not quantifiable, are unethical to study, or are too complex and change too fast. Not all problems can be reduced to, or solved by, optimizing a few objectives. I would argue that a lot of today’s problems arose because we have over-optimized and/or because we are overly confident in our data.
Concept 2: Hypothesizing what else could explain the outcome
Part of acknowledging that not all problems are data problems requires imagining “what else”. I argue that this imagination for other possibilities is actually a statistical concept as well.
Here’s a thought exercise. Recently, I gave a homework assignment that asked my students to try out a definition of the “best fit” line different from the one we were teaching in lecture. The left graph is what we normally teach in class (regression) and the right graph is what I assigned in the homework.
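For readers without the graphs, here is a rough sketch of the contrast. Since the homework’s exact alternative definition isn’t reproduced here, I’m assuming orthogonal (total least squares) fitting as a stand-in for the second definition: the same points, two definitions of “best”, two different lines.

```python
import numpy as np

# Illustrative data (my own assumption, not from the assignment).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=30)
y = 2.0 * x + 1.0 + rng.normal(scale=3.0, size=30)

# Definition 1 (what we teach in class): ordinary least squares,
# which minimizes vertical distances to the line.
ols_slope, ols_intercept = np.polyfit(x, y, deg=1)

# Definition 2 (assumed stand-in for the homework): total least squares,
# which minimizes perpendicular distances, via the principal direction
# of the centered data.
centered = np.column_stack([x - x.mean(), y - y.mean()])
_, _, vt = np.linalg.svd(centered, full_matrices=False)
tls_slope = vt[0, 1] / vt[0, 0]
tls_intercept = y.mean() - tls_slope * x.mean()

print(f"OLS line: y = {ols_slope:.2f}x + {ols_intercept:.2f}")
print(f"TLS line: y = {tls_slope:.2f}x + {tls_intercept:.2f}")
# Same data, two reasonable definitions of "best fit", two different lines.
```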
To my surprise, some students ignored the homework instructions and just performed the usual regression calculation. What do you conclude from this observation?
You can argue this reflects the poor quality of the students, the flaws in our education system, or my inability to communicate clearly. But if you immediately jumped to one explanation of this phenomenon, you could use a refresher in applied statistics!
In statistics, we constantly struggle with the many possibilities that could explain the data at hand. Just as the two graphs above illustrate, different models can produce the same data points. I truly believe that applying this level of imagination in everyday life can alleviate many conflicts and problems.
Hypothesizing different ways that can produce the same data
In romantic comedies, the twist often challenges our assumptions about which events could lead to the same behavior: a lying husband isn’t necessarily cheating but might be planning a surprise birthday party for you. At work, you should not dismiss the lack of minorities in leadership positions as a “the talent doesn’t exist” problem. And for data scientists, your failing model in production could be due to the biased data you had or… it could be you.
It turns out that this skill is hinted at in several places in the core statistics curriculum: the interpretation of a failed hypothesis test, the maxim that correlation does not imply causation, and the concept of the complement in probability.
In a tech company, people run A/B tests to see whether certain features detectably influenced the company’s bottom line. The logic behind an A/B test goes as follows: if the two user experiences are similar (the null hypothesis), then user engagement behavior should not reveal a huge difference. If you do observe a huge gap, however, then your original assumption that the experiences are similar is likely false. What’s left is to pray that the metrics moved in your favor rather than showing a negative impact. However, one of the best-kept secrets in industry is that most tests do not produce a detectable difference.
The usual industry response is then “at least no one is hurt by the change,” so there should be no reason to abandon the invested work. However, not detecting a change does not mean no change exists! Statisticians describe this as “failing to reject the null hypothesis”. Understandably, this wording confuses many students. In plain words, there are too many alternative explanations for a non-detectable difference, of which “there truly is no difference” is just one. Perhaps the test only had 2 users, the test never happened because the code was not deployed due to a bug, or the difference is much smaller than you thought. Jumping to the single conclusion that justifies your promotion but complicates the code base is often not the right choice.
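Here is a minimal simulation of that point. The conversion rates (5.0% vs 5.5%) and sample sizes are assumptions I’m making purely for illustration, not figures from any real test, and the two-proportion z-test is computed by hand.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
p_a, p_b = 0.050, 0.055            # the change truly helps, just a little

def ab_test_p_value(n_per_arm: int) -> float:
    """Simulate one A/B test and return its two-sided two-proportion z-test p-value."""
    conv_a = rng.binomial(n_per_arm, p_a)
    conv_b = rng.binomial(n_per_arm, p_b)
    pooled = (conv_a + conv_b) / (2 * n_per_arm)
    se = np.sqrt(pooled * (1 - pooled) * (2 / n_per_arm))
    z = (conv_b - conv_a) / n_per_arm / se
    return 2 * stats.norm.sf(abs(z))

for n in (200, 2_000, 200_000):
    print(f"n per arm = {n:>7,}: p-value = {ab_test_p_value(n):.3f}")
# With small samples, the test usually "fails to reject" even though the
# difference is real: an underpowered test is not evidence of no effect.
```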
The idea behind “correlation does not imply causation” is the same: there are too many possibilities that could explain the same outcome. The uptick in unique users to our website could be due to the marketing team’s recent efforts rather than our product launch.
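As a hand-wavy sketch of that scenario, assume (purely for illustration) that weekly marketing spend drives users and that the launch happened to coincide with the heavy-marketing weeks:

```python
import numpy as np

rng = np.random.default_rng(7)
weeks = 52
marketing = rng.uniform(0, 100, size=weeks)   # weekly ad spend (the confounder)
launched = (marketing > 60).astype(float)     # launch promoted only in heavy-spend weeks
users = 1_000 + 20 * marketing + rng.normal(scale=200, size=weeks)  # no launch effect at all

print("corr(launched, users)  =", round(np.corrcoef(launched, users)[0, 1], 2))
print("corr(marketing, users) =", round(np.corrcoef(marketing, users)[0, 1], 2))
# The launch "moves" users only because it rides along with marketing spend.
```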
And in probability, the complement is the “what else”. Sadly, this is often trivialized as an exercise in calculating the chance of getting at least one head out of 10 coin tosses.
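For completeness, here is that trivialized exercise worked out; asking “what else” turns it into a single subtraction.

```python
# P(at least one head in 10 fair tosses) = 1 - P(no heads at all).
p_no_heads = 0.5 ** 10
p_at_least_one_head = 1 - p_no_heads
print(p_at_least_one_head)   # 0.9990234375
```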
Why would this be a world-saving concept, though? To many data scientists, the complement of “only my model” is read as “not my model at all,” which often leads to toxic company politics and infighting. In actual politics, false dichotomies of “with us or against us” are another failure to internalize the many possible alternatives. And for laypeople, the inability to imagine or experiment will ultimately hinder personal growth and the ability to listen. Contrary to the “Assume good intentions” mantra heard in many HR trainings, I find this active imagination of concrete alternative explanations much more productive and rigorous.
To train for this, my previous post recommends exercising your creativity like a muscle every day. Who knew that art and statistics are not mutually exclusive fields?
Concept 3: Articulating the evidence necessary to convince you otherwise
This is something I never learned in school. As a manager, I had to articulate specifics to justify a direct report’s promotion. As a data quality lead, I had to articulate the specific validations I expected to see in product reviews. And as an employee, I had to articulate what leadership would have to do to convince me to stay.
It turns out that being specific and complete about what would change your mind requires a tremendous level of introspection and experience. And when you realize that the evidence you are demanding is impossible to obtain, or that you cannot even imagine such evidence, then either it’s not a data problem or you’re being unreasonable. In either case, the hope is that you would take a step back and come up with a constructive path forward. All too often, people want to operate in the style of “I will know it’s right when I see it” without challenging themselves to do the hard thinking.
To foster this culture, I think we first need to ask this question frequently and genuinely: “What do you need to see from me to change your mind? Would showing you [INSERT DATA] help resolve your concern?” This strategy has helped me a few times, but it can catch people off guard; people are often not ready to commit to a compromise in real time. It does, however, force a constructive conversation to start. The ideal state would be for people to proactively provide the list of requirements that would change their minds.
It is unfortunate that most statistics curricula rarely talk about real data collection. Some might even argue that data collection is the role of the scientist, not the statistician. Even with a PhD in Statistics, I only learned this through my sampling class. However, the ability to formalize scientific intuition into mathematical definitions, then execute the data collection, and then analyze the results is applied statistics. How can we bring this into the classroom to foster a much more constructive future?
Let statistics be the new global language
I believe that if we brought these three statistical concepts into our everyday attitude toward problems, societal conflicts would decrease and progress would accelerate. In my experience, statisticians, although often lacking subject matter knowledge, are skilled at bringing different perspectives and experts together because of these concepts. That is why I hope statistics can become the new global language.
Thanks for reading :)
Acknowledgement: many thanks to Henry Alford for the valuable feedback. The author assumes responsibility for all errors and typos.