Correlation Is Not Causation

Indi Young
Published in Inclusive Software
3 min read · Aug 6, 2020


We’ve been trained to respond to statistics with curiosity about more statistics.

In the spring of 2015, 210 women in Silicon Valley in senior technology positions participated in a survey. The results of the survey were published as Elephant in the Valley, with the hope of raising awareness about issues facing women in the workplace.

The hellebore: one resilient flower.

One of the survey results stated 60% of women in tech have experienced sexual harassment. A woman and I were chatting about this particular result. She is at the beginning of her tech career, and she said it had never happened to her. I expected her to take the conversation in the direction of how the most recent generation in the workplace has different habits and will hopefully eliminate sexual harassment, but instead she said, “You know, I’m curious. I wonder how that breaks down by size of company or by location.” Meaning, does the size or location of the company have any influence on sexual harassment? Maybe she just meant that bigger companies have more incidents because they have bigger populations. Because I’ve been talking so much about how tech falls into the trap of equating correlation with causation, I interpreted her curiosity as following this conventional path. We’ve been trained to respond to statistics with curiosity about more statistics.

Conventional Curiosity

For reasons probably stemming from tech & business culture’s love of truth and fact and science and numerical proof, it is a natural step to express interest in a statistic in terms of correlations between more numbers and behaviors. But there are other ways to express curiosity. In the particular case of sexual harassment in Silicon Valley, I’d look for data that reveals the patterns of inner thinking and guiding principles of the individuals perpetrating the violations.

Unfortunately, there are not a lot of studies that gather this inner thinking. About any topic. The rare team does this kind of research. It’s still uncommon.

So, even though everyone knows that a demographic like the size of a company isn’t going to cause a person to perpetrate sexual harassment, demographics are still the way most people express curiosity about it. Even if there is a correlation between size of company and sexual harassment, company size is not a cause. The behavior might be precipitated by those around the perpetrator at work, but its roots usually started growing in the perpetrator’s mind much earlier.

The reason that I’m more interested in understanding the inner reasoning, reactions, and guiding principles associated with behavior, even icky behavior, is that it reveals real reasons for behavior. These foundational reasons are what we can use to build things that can support or change the way a person acts. For example, are there patterns of thinking that we can identify and try to speak to, in order to build awareness about sexual harassment and other issues? Can awareness of one’s own style of thinking help to change that thinking and prevent incidents?

The Corollary

The other half of the problem I see is that teams make design decisions in favor of the behavioral group with the highest correlation to a certain situation. Instead of considering all the different behavioral groups, teams only seem to recognize one group. Making design decisions in support of one group, which possibly represents only 68%* of the audience, is a bad idea. It leaves 32% of the audience without direct support. Worse, of the 68% who are supported, many will not be in the right frame of mind for what you designed. In their book Design for Real Life, Sara Wachter-Boettcher and Eric Meyer warn that you can’t always predict who will use your products, or what emotional state they’ll be in when they do.

With awareness of this tendency, and with deeper understanding of people’s inner thinking forged through qualitative research, I hope the future brings support to many different behaviors with different solutions. And I hope product teams take the time to forge a valid understanding of the inner thinking of these audiences.

* See also Brandon Schauer’s presentation from the 2016 Managing Experience Conference where he talks about standard deviation on either side of the mean.
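
The 68% figure matches the share of a normal distribution that falls within one standard deviation on either side of the mean. Here is a minimal sketch of that arithmetic, assuming audience behavior is roughly normally distributed. That assumption and the function name `share_within` are for illustration only and do not come from the presentation.

```python
import math

def share_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Assumption for illustration: one "behavioral group" spans one standard
# deviation on either side of the mean.
supported = share_within(1.0)    # roughly 0.68
unsupported = 1.0 - supported    # roughly 0.32

print(f"supported: {supported:.0%}, unsupported: {unsupported:.0%}")
```

Under that assumption, roughly a third of the audience sits outside the single group a team designs for.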


