Data-driven off a cliff: Rekindling a love for the unknown and immeasurable

I became interested in experimental psychology after reading a book on how people pursue happiness.

Back then, I was fascinated by how researchers could take this completely abstract concept that existed in people’s heads, and not only measure it, but manipulate it, by tweaking the environments their research subjects were placed in. I wanted to learn what made people happy and motivated, so I studied psychology in college, graduated top of my major, went to grad school, and got a PhD in social psychology.

Along the way, I became less interested in people, and more interested in how we studied people. One of my favorite classes — that I barely passed (sorry, Pat) — was Psychometric Theory. Psychometrics is essentially the study of how to take these abstract constructs like happiness or motivation, and measure them: How to construct scales assessing how competitive someone is, for example, and how to judge the reliability and validity of that scale.

Also along the way, I discovered that studying people was hard. And not just “it’s really hard and I want to give up” hard, but “it’s really hard and researchers are people too, so they end up doing some really funky, non-kosher stuff” hard. Many years from now, I’ll remember being in grad school when the Replication Crisis made landfall. This was a series of high-profile attempts to replicate the results of famous studies in psychology and, for a significant number of them, the subsequent failure to do so. These failed replications then spurred a debate around questionable research practices, their prevalence, and their impact on existing bodies of research.

You can find the details of the above saga elsewhere, but in the end, nobody was happy. Everyone agreed that something had to be done, though very often that “something” was a thing for other people to do.

Now, at the end of my academic journey, I’m only confident about a couple of things. The first is that I know very little about human beings. Spending five years trying to predict what people might do a few minutes after exposing them to some stimuli, and then getting it wrong, will make you feel the same way. The second, which is more relevant to this essay, is that people really, really dislike not knowing stuff.

I think this second impulse is responsible for a lot of the “wrongness” in the world. From researchers fabricating data, to companies selling products that don’t exist (*cough* Theranos *cough*), there are many things that we can “do” to data (generally, modifying or creating it) in order to get what we want.

However, something that I’ve been thinking about recently is the opposite problem: How the existence, or potential existence, of data changes what we want.

Here in the San Francisco Bay Area, being “data-driven” is fashionable. I’ve heard companies describe themselves as being “obsessed” with data. I’m not sure it’s healthy to be obsessed with anything. And from the admittedly narrow lens of product development, I don’t think it’s healthy to be obsessed with data.

Good products are important. For better or worse, their influence on our lives is multifaceted, profoundly affecting how we behave in the long run, and changing the way we see the world in abstract, but meaningful ways. If someone took away your air conditioner, car, or refrigerator tomorrow, the thoughts and feelings you’d have could scarcely be captured on a questionnaire or survey.

I think this is true of all truly important things — they are difficult or impossible to measure.

I should clarify: It is supremely easy to put a number or a description to those important thoughts and feelings. You just have to ask someone to “rate how much they love their children on a scale of 1 to 10”. However, it is hard to do so in a meaningful way. Today, despite all the concern and controversy surrounding data privacy, hundreds of millions of people log on to social media platforms like Facebook or Twitter every day. Yet, I’d be willing to bet everything I own that if you went digging through those platforms’ troves of data from yesteryear, no single data point, or collection of data points, could have predicted how insidious the hold of these products would be.

On the other hand, there are many things that are easy to measure, and a lot of them are actually helpful. How many people who visit your site convert to paying customers; how likely people are to recommend your product to a friend — or whether they’ve done it in the past; how disappointed people would be if your product didn’t exist. All of these things are easy to measure, and are useful in so far as we should continually strive to make them better.

Where it starts to go wrong is when we confuse the ease of getting data with how important that data is. We all want users or customers who have a trust in our brand that transcends the ups and downs of business life. But the pathological obsession that compels us to continually search for a way to measure and optimize the things that we can’t admit are intangible results in a short-termism that might cost us just that. There is a data point that you can measure right now that tells you whether your users will tolerate you charging them 20% more for your product. There is no data point that tells you what will happen when you’ve done that five times in a row and people have had enough of your sh*t.

I guess what I am imploring us to do, and this is extremely ironic coming from a research methods nerd, is to take a step back from the data, regardless of how easy it is to collect, or how good it looks, or how much more money we could all make. Instead, I think we should spend more time thinking about what the right thing to do is, beyond the next month, year, or decade.

Because the right thing to do is probably going to be uncomfortably unmeasurable.