Why you don’t need a representative sample in your user research

Engaging a representative sample of participants in user research sounds like a good idea but it is flawed. It requires lots of participants, does not work in an agile development environment, stifles innovation and reduces your chances of finding problems in small sample usability tests. When combined with iterative design, theoretical sampling (where theory and data collection move hand in hand) provides a more practical alternative.

Photo by Ethan Weil on Unsplash

Sooner or later when you present your design research findings, someone will question your sample’s demographic representativeness. “How can you possibly understand our audience of thousands by speaking with just 5, or even 25, people?”, you’ll be asked. “How can you ensure that such a small sample is representative of the larger population?”

This question is based on the assumption that demographic representativeness is an important characteristic of design research. But this assumption is wrong for four reasons. The first two reasons are practical:

  • Representative samples require lots of participants.
  • Representative samples are difficult to achieve in an agile development environment.

The other two reasons are methodological:

  • Representative samples stifle innovation.
  • Representative samples reduce your chances of finding usability problems that affect a small number of users.

Let’s look at each of these in turn.

Representative samples require lots of participants

An obvious argument to make against demographic representativeness is that it results in sample sizes that are too large for almost all design research. For example, to make a sample representative of a demographic, you would aim for an equal balance of gender (male and female), domain knowledge (experts and novices), technical knowledge (digital savvy and digital novices), device types used (desktop and mobile), geographical location (urban and rural), age (Gen X and Millennial)… as well as additional, specific characteristics important for your particular product. To get even rudimentary coverage of these characteristics you will need a large sample size.

For example, with the six characteristics I’ve listed above, each split into two groups, you’ll need a sample size of 64 (2 × 2 × 2 × 2 × 2 × 2) to end up with just one person representing each combination of characteristics:

  • To achieve a representative demographic sample, our 64 participants will comprise 32 men and 32 women.
  • Our 32 men will comprise 16 domain experts and 16 domain novices.
  • Our 16 male domain experts will comprise 8 participants who are digitally savvy and 8 who are digital novices.
  • Our 8 male, domain expert, digitally savvy participants will comprise 4 desktop users and 4 mobile users.
  • Our 4 male, domain expert, digitally savvy, desktop participants will comprise 2 participants in an urban location and 2 participants in a rural location.
  • Our 2 male, domain expert, digitally savvy, desktop, urban participants will comprise 1 Gen X and 1 Millennial participant.

And what if that one participant is unusual in some other, non-representative way? Can one participant ever be representative of a segment? It seems that all we’ve done is move the issue of “representativeness” further down the chain. Boosting the sample size in each segment from one to (say) 5 participants (to make it more representative) means we now need a sample size of 320 participants. Very quickly, our sample size escalates dramatically as we build in more ‘representativeness’.
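The arithmetic behind these numbers can be checked with a short sketch. The characteristics and their two-way splits are the ones from the example above; the function names and the `per_segment` parameter are my own, chosen for illustration.

```python
from itertools import product

# Characteristics and groups taken from the example in the text.
CHARACTERISTICS = {
    "gender": ["male", "female"],
    "domain knowledge": ["expert", "novice"],
    "technical knowledge": ["digitally savvy", "digital novice"],
    "device": ["desktop", "mobile"],
    "location": ["urban", "rural"],
    "age": ["Gen X", "Millennial"],
}


def segments(characteristics):
    """Return every combination of groups, one tuple per segment."""
    return list(product(*characteristics.values()))


def required_sample_size(characteristics, per_segment=1):
    """Participants needed to put `per_segment` people in every segment."""
    return len(segments(characteristics)) * per_segment


print(required_sample_size(CHARACTERISTICS))                 # 64
print(required_sample_size(CHARACTERISTICS, per_segment=5))  # 320
```

Each extra two-way characteristic doubles the total, which is why the sample size runs away so quickly.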

This isn’t practical for design research.

Representative samples don’t play nicely with agile

A second reason a representative sample is impractical is that it doesn’t work with agile development. Recall our sample size of 64 participants from the previous section. This belongs to a world where we can define our research problem up front and plan exactly how to arrive at a solution. Yet few modern software development teams work like this, because requirements can’t be nailed down in advance. Instead, development teams rely on iteration, and this is the same approach we should adopt as user researchers.

For help, we can turn to the approach to participant sampling used by qualitative researchers, which is very different from statistical sampling. In particular, the qualitative researcher does not sample participants randomly. Instead, theory and data collection move hand in hand: the researcher collects and analyses data while also deciding what data to collect next and which participants to include. In other words, the process of data collection is controlled by the researcher’s emerging understanding of the overall story. This sampling technique is known as “theoretical sampling”.

This means the user researcher should select individuals, groups, and so on according to the expected level of new insights. You want to find users that will give you the greatest insights, viewed in the context of the data you’ve already collected.

It’s easy to see how you can adapt this approach to working in sprints with an agile team. Rather than do all of the research up front, we do just enough research to help the team move forward. Our sample size and its representativeness both increase as the project develops.

These two practical issues of representativeness — that it requires a large sample size and it doesn’t fit with agile ways of working — are important. But they do not fully address the point made by our critic. Practical research methods are great but we can’t use impracticality as our defence against shoddy research.

But practicality is not the only issue. There are methodological issues too. Aiming for a “representative” sample in user research stifles innovation and reduces your chances of finding problems in small sample usability tests. Let’s turn to those issues now.

Representative samples stifle innovation

A third problem with representative samples is that they stifle innovation. With research in the discovery phase, when we are trying to innovate and come up with new product ideas, we don’t know our audience.

Stop for a second and let that sink in: we don’t know our audience — or at least, we know it only roughly. There is no list of people that we can select from because we don’t have any customers — we may not even have a product. Indeed, part of the user researcher’s role in the discovery phase is to challenge what their team think of as “the product”. The role of the user researcher is to help development teams see beyond their tool to the user’s context, to understand users’ unmet needs, goals and motivations.

Since we don’t know who our final audience will be, it’s impossible to sample the audience in any way that’s representative. It would be like trying to sample people who will be driving an electric car in 2030. Even if we already have a list of customers who use an existing product, we can’t use only those people in our discovery research, because then we are speaking only to the converted. This reduces opportunities for innovation because we are speaking only to people whose needs we have already met.

Instead, to be truly innovative, we need to discover the boundaries and the rough shape of the experience for which we are designing. Rather than make predictions about an audience, innovative discovery research tries to make predictions about an experience. You’re creating tests and hypotheses to help you understand what’s going on.

As an example, let’s say I want to understand how people use headphones because I want to innovate in the headphone product space. I don’t pick a representative sample of headphone users. Instead I start somewhere — almost anywhere. Maybe with a commuter who wears headphones on the train.

Then I ask myself: “Who is most different from that user? Who would be the ‘opposite’?” That leads me to someone who has an entirely different context: perhaps an audiophile who uses headphones only at home.

But I need to explore this space fully. So let’s look at some of the edges: working musicians; sound recordists; teenagers who listen to bouncy techno.

Let’s look further and, to adopt the terminology of jobs-to-be-done, we might question what “job” the headphones are doing. If people are using headphones to shield out noise at work from co-workers, then maybe we want to understand the experience of people who wear ear defenders. If people are using headphones to learn a new language on their commute then maybe we want to look at the way people learn a foreign language online.

One of my favourite examples comes from IDEO. They were designing a new kind of sandal. They expressly included outliers in their sample, like podiatrists and foot fetishists, to see what they could learn. This is what I mean by understanding the boundaries of the research domain.
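The “who is most different?” question from the headphones example can be loosely sketched in code. This is only an illustration of the idea, not a recruiting method: in practice the choice rests on the researcher’s judgement. The profiles, the attribute names and the `overlap` measure are all hypothetical.

```python
def overlap(a, b):
    """Count the attributes (other than the name) on which two profiles agree."""
    return sum(1 for key in a if key != "name" and a.get(key) == b.get(key))


def most_different(candidates, interviewed):
    """Pick the candidate who shares the least with anyone already interviewed."""
    def closeness(candidate):
        return max(overlap(candidate, seen) for seen in interviewed)
    return min(candidates, key=closeness)


# Hypothetical profiles, loosely based on the headphones example.
commuter = {"name": "commuter", "place": "train",
            "goal": "pass the time", "expertise": "casual"}
candidates = [
    {"name": "language learner", "place": "train",
     "goal": "learn a language", "expertise": "casual"},
    {"name": "audiophile", "place": "home",
     "goal": "sound quality", "expertise": "enthusiast"},
    {"name": "office worker", "place": "office",
     "goal": "block out noise", "expertise": "casual"},
]

next_person = most_different(candidates, [commuter])
print(next_person["name"])  # audiophile
```

After each session the interviewed list grows, so the next pick depends on everything learned so far rather than on a quota fixed in advance.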

Representative samples reduce your chances of finding usability problems that affect a small number of users

We can’t use this same defence when it comes to usability testing. Now we know the rough shape of our audience: it would be foolish to involve (say) foreign language learners in a usability test of headphones aimed at working musicians. We need to match our participants to the tasks that they carry out.

But recall that a usability test typically involves a small number of participants (5 has become the industry standard). This is because 5 participants give us roughly an 85% chance of seeing, at least once, a problem that affects about 1 in 3 users. However, some important usability problems affect a much smaller proportion of users. On some systems, testing 5 users may find only 10% of the total problems, because the other 90% of problems affect fewer than 1 in 3 users.
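The figures above follow from a simple probability model. As a minimal sketch, assume that each participant independently has the same chance p of running into a given problem (this assumption and the function names are mine, not part of the article). The chance that at least one of n participants hits the problem is then 1 − (1 − p)^n.

```python
import math


def chance_of_seeing(p, n=5):
    """Chance that at least one of n participants hits a problem affecting a proportion p of users."""
    return 1 - (1 - p) ** n


def participants_needed(p, target=0.85):
    """Smallest number of participants giving at least `target` chance of seeing the problem."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


for p in (0.31, 0.10, 0.05):
    print(f"problem affects {p:.0%} of users: "
          f"{chance_of_seeing(p):.0%} chance with 5 participants")
# problem affects 31% of users: 84% chance with 5 participants
# problem affects 10% of users: 41% chance with 5 participants
# problem affects 5% of users: 23% chance with 5 participants
```

Under the same assumption, `participants_needed(0.10)` returns 19, which shows how poorly a 5-person sample copes with problems that affect only 1 in 10 users.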

To get the most value out of our usability test, it therefore makes sense to bias our sample to include participants who are more likely to experience problems with our product. This type of person might be less digitally savvy or may have less domain expertise than the norm.

This means you want to avoid having too many participants in your usability test sample who are technically proficient (even if they are otherwise representative of your audience). This is because these types of participant will be able to solve almost any technical riddle you throw at them. Instead, you should actively bias your sample towards people with lower digital skills and lower domain knowledge. Including people like this in your sample will make it much more likely you’ll find problems that affect a low proportion of users. This helps you make the most of your 5 participants.

Just to be clear, I’m not saying you should test your product with total novices. Participants in a usability test should be (potential) users of the product you’re testing. If your product is aimed at air traffic controllers, that’s where you draw your participant sample from. But to make the most of your small sample, recruit air traffic controllers who have less domain knowledge or lower digital skills than the norm for that group. In other words, bias your sample towards the left of the bell curve.

Your defence against non-representativeness is iterative design

There’s always the (unlikely but statistically possible) chance that every one of your participants in a round of research is unrepresentative in an important way. This will send the development team off at a tangent and risks derailing the project. For example, recruiting usability test participants who are less digitally savvy than the norm may result in false positives: mistakenly reporting a usability problem when one doesn’t exist. Why isn’t this more of an issue?

This isn’t a serious issue because of the power of iterative design. We involve a small sample of participants in our research and make some design decisions based on the outcomes. Some of these decisions will be right and some will be wrong (false positives). But with iterative design, we don’t stop there. These decisions lead to a new set of hypotheses that we test, perhaps with field visits to users or by creating a prototype. In this second round of research we involve another small sample of participants but, crucially, a different sample from before. This helps us identify the poor design decisions we made in earlier research sessions and reinforces the good decisions. We iterate and research again. Iterative design prevents us from making serious mistakes with our research findings because it uses the power of the experimental method to weed out our errors.

I titled this article, “Why you don’t need a ‘representative sample’ in your user research” but I could have gone further: in my view, you should actively avoid a “representative sample”. That’s because our goal is not to deliver a representative sample but to deliver representative research. User researchers can achieve this by combining iterative design with theoretical sampling.

Originally published at www.userfocus.co.uk.