Data Alone Isn’t Ground Truth

… and why you should always carry a healthy dose of skepticism in your back pocket.

I saw a chart the other day that highlighted the importance of bringing skepticism to any data analyses and visualizations we encounter. This post isn’t a dissertation, and it necessarily does not address the topic fully. But I think there are some good points I can make on the topic, and if it sparks a good conversation then I think it’s served a good purpose.

Here’s the chart that started me down this path:

The chart shows what appears to be an inverse relationship between average IQ estimates by college major and the proportion of women in said major’s undergraduate student population.

There is no implied interpretation in this chart; it merely shows a relationship. Still, were one to draw conclusions and make decisions based on it without scrutinizing it, those decisions would likely be deeply unfair. This thought made me feel the need (for what feels like the millionth time) to beat on my “data is not ground truth” drum set.

If you’ve seen me on Twitter, you may already be tired of hearing me say it by now.

Naturally, you may also be wondering, why is someone who pays their bills sciencing on data saying that “data is not ground truth?”

The short answer is that we can’t always trust empirical measures at face value: data is always biased, measurements always contain errors, systems always have confounders, and people always make assumptions.

Knowing that these things are true allows a good scientist to look for them, account and correct for them, and (maybe) reach a factual understanding of truth.

Untrustworthy Data Will Lead to Poor Conclusions

Trusting all data as if it were fact is a dangerous proposition. Given how data-driven decisions already impact people in very real ways (see this, and this, and this, and this), it is important to remember that “data” can be anything from deeply curated measurements to random bits of information that might bear no relationship to reality.

Remembering this is crucial when you’re sciencing on data whose collection you had no say in, that is, any time you’re using someone else’s data set for a purpose other than the one originally intended.

While data can reveal a counterintuitive reality, it’s also important to be skeptical when data seems at odds with well-understood concepts, or when it may have been shaped by phenomena unbeknownst to all involved. For instance, have you seen “that basketball counting psychology video?” I don’t want to spoil it if you haven’t seen it yet, so watch it first and I’ll wait.

Awareness test from Daniel Simons and Christopher Chabris

Did you watch it? You really should watch it before continuing, I don’t want to spoil it for you 😉

Ok. Now imagine we were to ask observers whether there was anything out of the ordinary in the video above. Most respondents will likely say “there was nothing out of the ordinary.” It is unlikely that a respondent will say “I didn’t see anything out of the ordinary.” If we were to receive a dataset of responses, any data points that recorded “nothing out of the ordinary present” would be incorrect.

Given all the different decisions that get baked into the design of an experiment and the collection of interventional or observational data, it’s impossible to be sure that the way in which you are interpreting a data set is the way in which it was intended to be interpreted — unless you have access to the original designers, of course.

Good Data Science Anticipates That Problems Will Happen

Stas Kolenikov reminded us of a good article discussing the issue of bias in data analysis by Robert M. Groves and Lars Lyberg:

There is nothing wrong with looking at data and visualizing it to clarify what relationships it may be describing. But is it even worth our time to try and understand what’s on the page? The last thing you want to do is waste a bunch of time thinking about a chart only to realize it wasn’t meaningful at all because the data it analyzes was garbage to start with… Obviously, I’m not the first person to think this.

On two occasions I have been asked, — “Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?” In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
Charles Babbage in Passages from the Life of a Philosopher (1864)

Back to the original chart that compelled me to write this: I’m not saying that the data used for said chart is maliciously synthetic to try and push an agenda (e.g., a narrative that women aren’t as smart as men). That does happen, but I don’t think that’s what’s going on here. So, what is going on here? The laziest possible explanation, one that lets a casual observer pretend there is no problem with assuming this data implies a conclusion about the intelligence of men and women, would be that women have something inherent to them (maybe in terms of values, thoughts, or feelings) that leads them to prefer certain majors. But that’s putting the cart before the horse. Can we even trust these numbers?

This analysis did not look at IQ measures but instead looked at SAT scores and linearly transformed them into IQ scores, which can be a problematic assumption in itself. Still, IQ tests measure only certain aspects of intelligence, and the Flynn effect alone tells us something funny is going on with IQ measures.
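To make the linear-transformation point concrete, here is a minimal sketch. Every number in it is invented for illustration, and the `sat_to_iq` slope and intercept are hypothetical placeholders, not the mapping the chart actually used. What it shows is a general property: a positive linear rescaling leaves the Pearson correlation unchanged, so an “IQ” axis built this way carries no information beyond the SAT scores it came from.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def sat_to_iq(sat, slope=0.1, intercept=0.0):
    """Hypothetical linear SAT-to-IQ mapping (assumed, not from the chart)."""
    return slope * sat + intercept

# Made-up per-major averages, purely illustrative.
sat_by_major = [1050.0, 1100.0, 1180.0, 1240.0, 1300.0]
share_women = [0.72, 0.65, 0.48, 0.35, 0.30]

iq_by_major = [sat_to_iq(s) for s in sat_by_major]

r_sat = pearson(sat_by_major, share_women)
r_iq = pearson(iq_by_major, share_women)

# The two coefficients agree up to floating-point noise.
print(r_sat, r_iq)
```

In other words, relabeling the axis changes what a reader thinks was measured without changing anything about the relationship being plotted.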

IQ tests also do not evaluate many variables such as empathy or communication skills which are important aspects of professional success in any major. It would have been unsurprising (again, had IQ scores actually been originally measured) that “high IQ” would more highly correlate with STEM fields independently of exogenous reasons driving the gender ratios of those fields.

This is an interesting example that allows us to dissect the assumptions baked into this data visualization with confidence, given the well-researched tangents of preferential funding, disproportionate attrition, lack of respect, gatekeeping, disincentives, thin representation in senior roles, etc. I’m not going to delve into whether women are more or less intelligent than men as there are entire academic disciplines already devoted to studying and explaining this better than I could (see this and this). My point is that being skeptical of data is essential, and that drawing conclusions about whether men or women are smarter is not possible from the data used to create the chart in question.

More generally, no one analysis will “prove” something. This dataset came from one population of students in one year, so until the finding is replicated in other (preferably better collected) datasets, it’s really only an observation — and one that should be taken with an iceberg-sized grain of salt.

So, can any data ever be trusted?

The short answer is… it depends. Skepticism is not a free pass to disregard data you disagree with. It’s a tool to ensure that the conclusions derived from data are reliable and do, in fact, reflect reality.

You also shouldn’t trust data just because it “proves” a point that you’re already inclined to believe. It’s probably even more important to be skeptical of extraordinary claims with which your heuristics already naturally align.

It takes a lifetime to build credibility, and it’d be silly of me to tell you to distrust sources of data that have gone through the effort of establishing their bona fides and ensuring nuanced and balanced approaches to analysis. Signs of such sources include a track record of reliable results, transparent methods, use of scientific best practices, validation of results against other sources, and demonstrated efforts to confirm validity and to eliminate (or at least minimize) biases, such as survey quality checks that make sure a respondent is paying attention and not just answering at random.
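For a sense of what a survey quality check can look like in practice, here is a rough sketch. The rows, field names, and the expected answer are all hypothetical; real surveys use many kinds of checks, and this is only one simple pattern: include an item with a known correct answer and set aside respondents who miss it.

```python
# Hypothetical survey rows; field names and answers are invented for illustration.
responses = [
    {"id": 1, "attention_check": "blue", "saw_anything_odd": "no"},
    {"id": 2, "attention_check": "green", "saw_anything_odd": "no"},
    {"id": 3, "attention_check": "blue", "saw_anything_odd": "yes"},
]

# Assumed instruction in the survey: "To show you are reading, select 'blue'."
EXPECTED_ANSWER = "blue"

def split_by_attention(rows, expected=EXPECTED_ANSWER):
    """Separate rows that passed the attention-check item from those that failed."""
    kept = [r for r in rows if r["attention_check"] == expected]
    dropped = [r for r in rows if r["attention_check"] != expected]
    return kept, dropped

kept, dropped = split_by_attention(responses)
print(len(kept), "kept;", len(dropped), "set aside for review")
```

Keeping the failed rows in a separate list, rather than silently deleting them, leaves a record of how much data was excluded and why.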

Much like you shouldn’t trust a stranger who emails you about a once-in-a-lifetime opportunity to make money, you probably need to be skeptical of the sources and methods applied to the systems under observation, particularly when you don’t have direct knowledge of those sources, methods, and systems.

And make sure you build a habit of reading and thinking through the fine print, even if it’s not written down… 👍

Many thanks to Chris Albon, Husain al-Mohssen, Mara Averick, Ed Cuoco, Ben Kehoe, Randy Olson, Jason Emory Parker, Hilary Parker, Mikhail Popov, and Andrew Therriault for reading through this essay and catching my errors in spelling, logic, and framing, as well as adding much needed clarity and evidence to it.

Obligatory Disclaimer: Any views or opinions presented herein are solely mine and do not necessarily represent those of any company I have been, currently am, or will ever be associated with. The postings on this site and any site wherein I share content are my own and don’t represent my employer’s (or anyone else’s) positions, strategies, or opinions.