Qualitative Research Is Always Biased
As a qualitative UX researcher at Meetup, Twitter, and now Foursquare, I’ve spent thousands of hours thinking about biases in qualitative research. Since this is such an important part of my work, I wanted to take a few minutes to explain what I have learned about bias, and why I think biased data is still worth collecting.
To start off, let’s talk about where these biases come from. Many of our biases result from mental models that were evolutionarily advantageous in past environments but don’t suit us well in our world today (h/t Timoni West). To see how our environmental shifts have resulted in cognitive mismatches, see Wikipedia’s popular and fantastic list of cognitive biases. When they see this, many people go straight from morbid fascination to helplessness. After all, how can we design great software for such messy creatures?! Code respects logic. Somehow, as designers of social software, we have to use this clean, robotic scaffolding to build something it’s poorly suited to support: utterly illogical systems designed to help bumbling humans connect with each other despite the 165 ways our brains are trying to thwart our best efforts.
I’ll go through some of the common ways bias creeps into my projects and I’ll flag what I’ve learned to do over the years to produce results I can still believe in. I hope you can find your own ways to thoughtfully engage with your own biases. You’ll end up with better results and you’ll learn a lot about yourself along the way. This will be most helpful to you if you are a new researcher or if you work with researchers and you want to better understand why biased data can still be directionally accurate.
Most of the biases I worry about are variations on selection bias in my participant pools. Wikipedia is truly a fount of knowledge; this list of statistics biases is a good overview of the main things to consider when choosing participants. Some of these are within my control and others aren’t. For example, I can control my recruiting criteria. I can choose to exclude participants who work at tech companies, participants with unusual user behaviors, participants who don’t fit the profiles we’re trying to serve with our products, etc. Depending on what our research goals are, there are very specific recruiting considerations we need to think through to make sure we’re getting feedback from the right people. The right data from the wrong people will still lead you down the wrong path!
However, I can’t control other aspects of selection bias. I can’t control who opens the email inviting them to the project, or who answers the email and has the time to participate in a research project. Some of our studies require that people come to our offices; not everyone has the flexibility to travel to us to spend an hour chatting. When we incentivize studies, we’re adding another bias: we’re probably oversampling deal hunters and low-income populations like students. Conversely, if we were to do studies without offering incentives, we’d be introducing a different bias. We’d be hearing from people who really value being heard. Some people don’t have others around them to appreciate their unique perspectives, and sharing their opinions with a researcher can feel really rewarding! There isn’t an unbiased solution, leaving me to choose the bias I am most comfortable living with.
Every time I introduce a known sampling bias, I’m also throwing unknown wrenches into the data I get back. For instance, maybe I’m disproportionately sampling students and maybe students are disproportionately likely to go out in large groups without reservations (I’m making this up). In this case, my recommendations might show that we should focus on features for big groups, even though that might benefit a minority of our users. What should we do about this? I have three pieces of advice.
- Choose who not to talk to: When you’re defining the project, discuss with your product teams the ways your recruiting criteria might lead to selection biases. When you decide who will be included, also think about who won’t be. Specifically write out which populations you are ignoring. It’s fine to leave certain groups out of your study population; focus is helpful! Just as product teams benefit from having clearly articulated “non-goals,” you should also be clear about which voices likely won’t be heard in your study.
- Choose your methods: Different research methods will introduce different selection biases, some of which might particularly harm certain research goals. For example, if you really need to hear from busy people, choose a method that lets people give feedback on their own time rather than asking them to come participate in an hour-long interview in your office. Maybe you would really prefer interviews. However, the interview data you’ll get from a skewed population may be worse than small nuggets of diary study feedback from the right people.
- Avoid extrapolating beyond your population: Look at all the data you get back after the data collection phase and carefully consider which data could have been affected by the sampling. To give an obvious example, I recently did a project with Swarm users in Turkey and asked participants which apps they use daily. We learned some really interesting things about digital behaviors. Of course we can’t extrapolate those findings out to the Turkish population more broadly. We learned interesting things about the specific group we heard from, which was valuable in itself. It’s important to always talk about your findings within the context of the targeted population; it would be easy to say “we know that people in Turkey love Snapchat.” That may or may not be true, but we know that Swarm users in Turkey do.
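The made-up student example above can be turned into a toy simulation. Everything here is hypothetical: the population size, the 20% student share, the preference rates, and the recruiting probabilities are all invented numbers. The point is only to show how an invisible recruiting skew inflates an observed preference.

```python
import random

random.seed(42)

# Invented numbers, mirroring the article's hypothetical:
# 20% of users are students, and students are much more likely
# to want features for big groups.
def person():
    is_student = random.random() < 0.20
    wants_big_groups = random.random() < (0.70 if is_student else 0.20)
    return is_student, wants_big_groups

population = [person() for _ in range(100_000)]
true_rate = sum(wants for _, wants in population) / len(population)

# Hypothetical selection bias: students are five times more likely
# to answer the recruiting email.
sample = [p for p in population if random.random() < (0.50 if p[0] else 0.10)]
observed_rate = sum(wants for _, wants in sample) / len(sample)

print(f"whole population wanting big-group features: {true_rate:.0%}")
print(f"biased sample wanting big-group features:    {observed_rate:.0%}")
```

With these invented numbers, roughly 30% of the whole population wants big-group features, but the skewed sample reports a rate closer to 50%. That gap is exactly the kind of distortion the three habits above are meant to catch before it turns into a product recommendation.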
One good thing about most of these biases is that they are consistent from project to project. When the sampling is the same and the results are different, you can feel confident that the difference comes from changes you’ve made in your product, or from changes in your users’ perceptions of you over time; that’s why baselining studies are so helpful.
Even when you’ve carefully considered sampling, there are essentially infinite ways to bias the results during the data collection phase. Again, just as with sampling, there are some biases in data collection that can’t be avoided. For example, every question I ask gives participants information about my research goals, so the further along we get in a survey or an interview, the more likely people are to tell me what they have assumed I’m trying to learn based on all the questions up until that point.
These are the three most common avoidable ways I’ve seen moderators introduce bias during the research data collection phase (there are about a zillion more):
- Asking people to report on past or future behavior. As a species, we aren’t great at accurately remembering what we did or predicting what we will do. People are most accurate when talking about the present. Of course, it’s not realistic to narrow the scope of an interview down to only the current moment. Just know that when you are talking about the past or future, there could be inaccuracies. If it’s relevant, you can ground conversations about the past by asking people to bring records like calendar events or emails, or to prepare by looking at their transaction histories. If that’s not possible, be as specific as possible with your questions, and don’t force an answer if someone can’t remember; you’d probably just be getting murky data.
- Asking questions where the desired answer is obvious. This is the classic “leading question” bias. If participants can tell what you’re hoping to hear it’s easy for them to play along. People generally want to be liked, and they may feel like you’ll like them more if they say what you want to hear. Check for positive or negative modifiers, and see if there’s a more neutral word you can use instead without sacrificing clarity. Walmart provided researchers with the classic example of what not to do when the leading word “cluttered” in a survey cost the company over a billion dollars.
- Asking narrow questions too early. The general flow of an interview should go from general, broad questions to specific, narrow questions. The earlier conversation gives context for later questions, and as a moderator, you can avoid asking things that might not apply to the person you’re talking to. If you are running interviews about social planning and you jump into questions like “Do you RSVP to Facebook events?” early in the interview, you’re probably missing out on a lot of important context about how your participant spends free time, who their closest friends are, whether they’re usually a follower or a leader in making social plans, etc. Avoid yes-or-no questions (like the example above) and make sure that you’re using the information you’ve already collected through your general background questions to tailor your specific questions to the person you’re talking to. For example, once you have enough background, you could ask a question like “You said you’re spending a lot of time with your roommates and that they’re usually the ones deciding where to go. How do they tell you about their plans?”
We’re all human and these things happen. The best advice I have for you on dealing with moderator-introduced biases is to recognize the bias, acknowledge it with your team (trust me, they’ll love you for being honest), and then just throw out that data point. Data based on flawed questions is also flawed.
When I started my career, I wanted to go into each study as a blank slate, withholding judgment until I had research data to refer to. If I found myself expecting a certain outcome from a research project, I tried to reset that expectation and remind myself that I hadn’t conducted the research yet and therefore had no basis for my guess.
I see things differently with 10 years of experience behind me. Wanting to be unbiased doesn’t make me so… and I’m not even convinced it’s a great goal anyway. Instead I’ve learned to examine my own biases and carefully think about how they might affect the study. Once I do that, I have enough experience to deal with the larger, study-tainting biases, and I can address the smaller biases with the team so they understand them too.
Which brings us to our final question: since there are other fantastic and less-biased data sources (like user metrics), why should we even bother? The reality is that qualitative research gives us data we can’t get any other way. It can answer “why” while other methods only answer “what,” “how,” “when,” and “where.” And luckily, we can ground qualitative feedback in user metrics so we can validate that a certain population really is worth building for. At this point in the tech industry, it’s an accepted truth that qualitative data and quantitative data tell a better story together than either data source tells on its own.
Qualitative research involves people which means it will always be complicated, but as the engineers around me would say, that’s a feature not a bug :)
I love talking shop with other researchers; if you want to chat more about this, get in touch!