The Case for Talking to Users in the Age of Big Data
or, How I Learned to Stop Worrying and Love Small Sample Sizes
I remember explaining the nature of some qualitative design research I was doing to a friend who happens to be a digital product PM.
“So you’re going to talk to six users today, and observe them using the product?”
“Correct,” I said.
“And— don’t take this the wrong way— that counts as research? I mean, six people isn’t enough to get a representative sample of anything, is it?”
That’s a fairly common reaction to qualitative design research, in my experience. A lot of people are skeptical of its small sample sizes from the outset and worry that anything you learn from it will be subject to chance and interpretation, and is therefore not a reliable basis for a generalizable finding.
They’re totally right.
“Correct,” I said again to my friend.
You won’t get generalizable, projectable data from talking to six people. It’s pretty close to impossible. If there are numbers that can tell a story you hear in an interview with a user, the numbers will tell it more accurately than that one user will.
But, of course, there aren’t numbers for everything. And that’s the dead-simple reason why qualitative research is important despite its small sample sizes. Good qualitative research doesn’t try to measure easily measured things. Sometimes it can offer a glimmer of things that can be measured in a projectable way, in which case it can often be a good guide for quantitative work. But at its best, it will look for and explore all the stuff numbers can’t tell you.
And when it comes to understanding how people are using and interacting with a product, that is a lot of stuff: when someone tries to complete an action, do they struggle to do it? Once they’ve completed the action, do they understand what they’ve just done and all of the implications of having done it within the context of a broader system? Do they struggle at all to check on the status of the completed action?
It’s true that you could probably infer some answers to questions like these from behavioral metrics. You could probably even develop a survey instrument to measure them in the moment. But a quantitative approach like that may be costly to implement in terms of time and technology. And even if it’s easy to implement, it may capture the wrong data. A KPI showing you whether or not a user accomplished an action tells you nothing about how easily they did it or whether they even knew they had done it in the first place. Similarly, a survey question will provide you with a static response about their mindset that will invariably be shaped by the question you asked, the answer architecture you provided and the context in which you asked it.
Talking to users can get around a lot of this. Human interaction is actually pretty data-rich stuff, even if that data isn’t easily quantifiable. And while it, too, is subject to the biases and unintended influences of how it’s framed and conducted, it’s also incredibly nuanced. When we communicate person-to-person without any instrumented mediation, we communicate subtlety, nuance and conditions very efficiently. We can say why we didn’t like something, or why we might have liked something if only it were executed in a slightly different way or if our motivations happened to change for any reason. We’re able to express a complex sentiment like that in just one sentence during a conversation with another person, but you would need at least three separate metrics to maybe, possibly capture it quantitatively (assuming you thought to measure them in the first place).
My favorite part of Nate Silver’s The Signal and the Noise was Silver’s story about attempting to interview Dustin Pedroia before a Red Sox game. Silver picked up on something from seeing the diminutive second baseman warm up. Pedroia was focused, deadly serious, suffering no distractions.
“As I watched Pedroia take infield practice, grabbing throws from Kevin Youkilis, the team’s hulking third baseman, and relaying them to his new first baseman Casey Kotchman, it was clear that there was something different about him. Pedroia’s actions were precise, whereas Youkilis botched a few plays and Kotchman’s attention seemed to wander. But mostly there was attitude: Pedroia whipped the ball around the infield, looking annoyed whenever he perceived a lack of concentration from his teammates.”
Silver, the very personification of the power of big data over human interpretation in popular culture today, knew that this meant something. In that passage, you can practically hear the voice of Joe Morgan in Silver’s brain saying something to the effect of, “This kid’s a gamer.” It’s just the type of unmeasurable and unquantifiable observation that would make many sabermetricians shiver. But Silver recognized it as a form of data, and he recognized that it would be every bit as foolish to assess Pedroia as a player without considering it as it would be to assess him exclusively according to it. “The key to making a good forecast,” Silver writes later in the baseball chapter, “is not in limiting yourself to quantitative information. Rather, it’s having a good process for weighing the information appropriately.” Ditto for good product research.
Observing users in person provides you with data that surveys and behavioral data simply can’t, just as surveys and behavioral metrics provide you with data and reliability that qualitative work can’t. You need both— and you need to do both well— if you’re serious about understanding how people use your product.
It’s the qualitative design researcher’s job to investigate the things for which there are no easy measurements. The researcher is the only instrument out there for measuring human responses to a product in a human way, something that numbers can’t do. At least not yet.