5 users is not enough ☹

MC Dean
Designing Atlassian
4 min read · Jun 2, 2015

A question that often comes up in design circles is how many users you should test with, and why.

There’s a lot of science available to answer this question accurately, and here I’m going to try to dispel the myth around the “5 user rule”.

The problem with the Nielsen 5 user rule is that it’s only true in very specific circumstances, and even then it is pushing the boat out a fair bit. Nielsen’s formula for the number of usability problems found with n test users is:

N(1 − (1 − L)^n)

where N is the total number of usability problems in the design, n is the number of users tested, and L is the proportion of usability problems discovered while testing a single user. The typical value of L is 31%, averaged across the large number of projects Nielsen and Landauer studied.
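To see where the famous cut-off comes from, here’s a minimal Python sketch (my own illustration, not from Nielsen) that evaluates the proportion of problems found, i.e. the formula divided by N, with L = 0.31:

```python
# Proportion of usability problems found after testing n users,
# i.e. Nielsen's N(1 - (1 - L)^n) divided by N.
L = 0.31  # average share of problems a single test user uncovers

for n in range(1, 11):
    found = 1 - (1 - L) ** n
    print(f"{n:2d} users -> {found:.0%} of problems found")
```

At five users the curve passes roughly 85%, which is exactly where the “5 user rule” comes from; the remaining ~15% is what you’re gambling on.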

“As you add more and more users, you learn less and less because you will keep seeing the same things again and again. There is no real need to keep observing the same thing multiple times, and you will be very motivated to go back to the drawing board and redesign the site to eliminate the usability problems. After the fifth user, you are wasting your time by observing the same findings repeatedly but not learning much new.”

A lot of people completely ignore the next few lines of the paper, though:

“This formula only holds for comparable users who will be using the site/product in fairly similar ways”

Now that really narrows things down, and means that in many cases, the rule is not applicable.

How many is enough then?

It all depends on:

  • The results you are observing
  • The range of target user groups
  • The range of tasks that you want to observe
  • What the results are going to be used for
  • How many rounds of testing you intend on doing
  • Whether you need statistically significant results
  • The mission criticality of what you’re testing
  • And a whole lot more…
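That said, if you want a rough starting point under Nielsen’s own assumptions, you can invert his formula to ask how many users a target discovery rate implies. A quick sketch (the target rates are illustrative, and L still varies project to project):

```python
import math

def users_needed(target, L=0.31):
    """Smallest n such that 1 - (1 - L)**n >= target (Nielsen's formula)."""
    return math.ceil(math.log(1 - target) / math.log(1 - L))

for target in (0.80, 0.90, 0.95, 0.99):
    print(f"to find {target:.0%} of problems: {users_needed(target):2d} users")
```

Notice how fast the numbers climb once you want real coverage, and this still assumes a single homogeneous user group.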

Laura Faulkner conducted a good, in-depth study of the 5 user rule and shared the results in her paper “Beyond the five-user assumption: Benefits of increased sample sizes in usability testing”. She showed that confidence intervals for samples of five users were around 18.6%…ouch!

“Although practitioners like simple directive answers such as the 5-user assumption, the only clear answer to valid usability testing is that the test users must be representative of the target population. The important and often complex issue, then, becomes defining the target population. There are strategies that a practitioner can employ to attain a higher accuracy rate in usability testing. One would be to focus testing on users with goals and abilities representative of the expected user population. When fielding a product to a general population, one should run as many users of varying experience levels and abilities as possible. Designing for a diverse user population and testing usability are complex tasks. It is advisable to run the maximum number of participants that schedules, budgets, and availability allow. The mathematical benefits of adding test users should be cited. More test users means greater confidence that the problems that need to be fixed will be found.”

Look at Figure 1 in her paper, and commit it to memory for keeps:

Laura Faulkner’s graphs
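You can get a feel for what her graphs show with a toy Monte Carlo resampling sketch in the spirit of her method. Everything below is synthetic (made-up detection rates, not her data); the point is only that five-user samples vary wildly while larger samples settle down:

```python
import random

random.seed(1)
NUM_USERS, NUM_PROBLEMS, TRIALS = 60, 20, 1000

# Synthetic ground truth: the set of problems each user would hit.
rates = [random.uniform(0.1, 0.6) for _ in range(NUM_PROBLEMS)]
hits = [{p for p in range(NUM_PROBLEMS) if random.random() < rates[p]}
        for _ in range(NUM_USERS)]
all_found = set().union(*hits)  # problems at least one user can find

for size in (5, 10, 20):
    shares = []
    for _ in range(TRIALS):
        sample = random.sample(hits, size)
        shares.append(len(set().union(*sample)) / len(all_found))
    print(f"n={size:2d}: mean {sum(shares) / TRIALS:.0%}, worst {min(shares):.0%}")
```

The worst-case number is the one to stare at: an unlucky five-user sample can easily miss problems that matter.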

Do your own tests, and measure things accurately. You don’t need statistically significant results or tight confidence intervals for everything you test. For me, the decider on how thorough to be is risk. If it’s something I can fail on and fix quickly, then I’ll take a gamble. If it’s something I need to be sure about, I’ll go to town.

Why it helps to have a hypothesis…

Have you ever found some weird “truths” in your data that feel wrong? If you don’t start with a hypothesis, you will find patterns that are there but meaningless. This statistical phenomenon is called the “clustering illusion”. In large random datasets you’ll find clusters of the same type of information, because there are a lot more clustered patterns out there than non-clustered ones. It doesn’t mean that you have necessarily found anything meaningful. For example, your data may show that all dentists have red shoes, or that all runners are Capricorns. This video explains it nicely, and I highly recommend the book and website.
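A tiny sketch makes the point. Flip a fair coin a couple of hundred times and you will almost always find a streak long enough to look “meaningful”, even though the data is pure noise (200 flips is an arbitrary choice):

```python
import random

random.seed(42)
flips = [random.choice("HT") for _ in range(200)]

# Length of the longest run of identical outcomes.
longest = run = 1
for prev, cur in zip(flips, flips[1:]):
    run = run + 1 if cur == prev else 1
    longest = max(longest, run)

print(f"longest streak in 200 fair coin flips: {longest}")
```

Streaks of seven or eight identical flips in a row are typical, and they look exactly like a pattern if you weren’t expecting them.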

Where to start with stats

It’s good to understand stats if you’re going to be measuring pretty much anything; if you don’t, you may be misled by what you’re seeing. You should understand the basics at least. A great book is Measuring the User Experience, and another is Quantifying the User Experience.
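As a taste of what those books cover, Sauro and Lewis recommend the adjusted-Wald interval for completion rates from small samples. A minimal sketch (the 4-out-of-5 result is illustrative):

```python
import math

def adjusted_wald(successes, n, z=1.96):
    """Adjusted-Wald 95% confidence interval for a completion rate,
    the small-sample method recommended in Quantifying the User Experience."""
    n_adj = n + z * z
    p_adj = (successes + z * z / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Example: 4 of 5 users complete the task. Point estimate 80%, but:
low, high = adjusted_wald(4, 5)
print(f"completion rate: 80%, 95% CI: {low:.0%} to {high:.0%}")
```

An 80% point estimate with an interval stretching from the mid-30s to nearly 100% is exactly why small-sample results deserve humility.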

Did you enjoy this post? Want more of the same? Consider following Designing Atlassian!
