Choosing the right sample in Quantitative UX research
Contributed by Kiranraj Govind, User Research Manager, Khatabook
Throughout my experience as a researcher in consumer and UX research, one of the questions I have consistently faced is: “How big should my sample size be?” The answer, in most cases, is complex. For discussion’s sake, let us focus on quantitative research.
In quantitative research, determining sample size is very important. In recent years, arriving at a sample size has become significantly more convenient because there are now many online sample size calculators that can tell you how many people you should sample. Like this one: Sample Size Calculator
While these tools serve well in most cases, if you want to build a basic rule of thumb for arriving at sample sizes, please read on.
Sample Selection in Quantitative UX research:
The gross generalization that N=30 is adequate:
There is a widespread belief in the research fraternity that a sample of 30 is large enough to read data with statistical confidence. There is some truth in this, but also some gross generalization. The N=30 rule comes from the Central Limit Theorem, which states that the sampling distribution of the mean will approximate a normal distribution when the sample size is sufficiently large; this condition is usually considered met when n ≥ 30. However, if you talk to anyone who has studied statistics, they will tell you that while there is some truth here, deciding to read data on a base of 30 is a mistake. Thirty is a very small base and can often lead us to incorrect conclusions.
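To see why n=30 gives only rough stability, here is a minimal simulation sketch in Python. The population of task-completion times and all the numbers are hypothetical; the point is that the spread of sample means shrinks only with the square root of the sample size:

```python
import random
import statistics

# Simulate the Central Limit Theorem with a skewed population:
# hypothetical task-completion times, roughly exponential, mean ~40s.
random.seed(42)
population = [random.expovariate(1 / 40) for _ in range(100_000)]

def sample_means(n, trials=2_000):
    """Means of `trials` random samples of size n from the population."""
    return [statistics.mean(random.sample(population, n)) for _ in range(trials)]

for n in (5, 30, 200):
    means = sample_means(n)
    # The distribution of sample means narrows as n grows (~1/sqrt(n)),
    # but at n=30 a single study's mean can still miss the truth widely.
    print(f"n={n:>3}: mean of means={statistics.mean(means):5.1f}s, "
          f"spread (stdev)={statistics.stdev(means):4.1f}s")
```

Running this shows that at n=30 an individual sample mean still swings several seconds around the true mean, which is exactly the kind of noise that produces confident but wrong conclusions.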
Ideal sample size, confidence levels, and margin of error:
Statistically speaking, a sample of 385 gives you a 95% confidence level with a 5% margin of error. Hence, if you want to give an ideal answer, this is it. Throughout my experience as a researcher, however, there have always been constraints in terms of time and resources, so when we decide on sampling, we may not be able to use the optimal sample size. When time and resources have been constrained, I have always recommended that my clients and organization keep a base of at least 200; at N=200, the data remains stable to a large extent. In many instances, you will also be asked to look at a specific cohort or sub-group within the population. In those cases, you should have a minimum sample of at least 100 for any sub-group you want to read. Nielsen Norman Group, in this article, explains the concepts of margin of error and confidence interval in a very simple and concise manner.
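For reference, the 385 figure comes from the standard formula for estimating a proportion, n = z²·p(1−p)/e², with p = 0.5 as the most conservative assumption. A minimal Python sketch of the same math most online calculators use:

```python
import math

def required_sample_size(confidence=0.95, margin_of_error=0.05, p=0.5):
    """Sample size for estimating a proportion in a large population,
    using n = z^2 * p * (1 - p) / e^2. p=0.5 is the most conservative
    assumption and is what most online calculators default to."""
    z_scores = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}  # common z values
    z = z_scores[confidence]
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

print(required_sample_size())                       # 385 (95%, ±5%)
print(required_sample_size(margin_of_error=0.025))  # 1537 (95%, ±2.5%)
```

Note that halving the margin of error roughly quadruples the required sample, which is why tightening precision gets expensive quickly.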
What is the maximum sample size we should cover?
Typically, you would have heard that the larger the sample size, the more robust the data. This is true, but we also need to look at this statement more realistically. Covering larger samples means more time and more cost, and the real question is whether the incremental sample yields a proportional increase in the quality of the data. For example, suppose I run one survey with n=1000 and another with n=5000. Assuming all other aspects of the two studies are exactly the same, the results are likely to be very similar, and any difference between them is unlikely to be statistically significant. Hence, as a rule of thumb, I would suggest that if you have the resources and time, N=1000 is the sample you can look at. A sample of 1000 will in most cases satisfy all requirements.
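The diminishing returns are easy to see by computing the margin of error at different sample sizes, since it shrinks only with 1/√n. A rough Python sketch:

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error for a proportion at 95% confidence (p=0.5 worst case)."""
    return z * math.sqrt(p * (1 - p) / n)

# Precision improves only with the square root of the sample size.
for n in (30, 100, 200, 385, 1000, 5000):
    print(f"n={n:>4}: ±{margin_of_error(n) * 100:.1f}%")
```

Going from n=1000 to n=5000 improves precision from roughly ±3.1% to ±1.4%, a gain that rarely justifies five times the fieldwork cost.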
The above applies to simple surveys that follow random sampling. For other kinds of specialized research, such as pricing, census, forecasting, and surveys where different forms of multivariate analysis are used, the sample requirements will differ.
There will always be researchers who can understand and apply statistics better than most others. Still, for the rest of us, the above suggestions will work fine in most instances. So you can relax and stop worrying!