Determining Sample Sizes for A/B Testing Using Power Analysis

Michael Zhang
Published in The Startup
9 min read · Jul 6, 2020


Source: Optimizely

A/B tests are ubiquitous today as a tool for designing just about anything. They are often seen as a more objective way of answering the age-old question, “which is better?”, by quantifiably measuring success. In my eyes, the process of conducting an A/B test has always epitomized the “science” in data science. It shares many elements with randomized controlled trials (RCTs), the gold standard not only for clinical trials but also for psychology laboratory experiments: both create a setup designed to isolate and measure the effect of changing what one group sees vs. another.

However, the question of sample size (i.e. “how many people do I need to test on?”) comes up frequently, because testing takes time, and time is money. Ideally, as soon as you have a confident answer to your question, you can choose the winner and stop using the inferior version. By using a power analysis to determine sample size, you can get a better sense up front of how long a test will need to run before it can confidently confirm or refute your hypothesis.

What is a power analysis?

To determine the minimum sample size required for a statistically robust test, you can run an a priori power analysis. A priori refers to the fact that the analysis is occurring before…
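
To make the idea of an a priori calculation a bit more concrete, here is a minimal sketch of one common way to estimate sample size when comparing two conversion rates. It uses the standard normal-approximation formula for a two-sided test of two proportions, which may or may not match the exact method a given tool uses. The function names, the 10% baseline rate, the 12% target rate, and the 2,000 visitors per day are all illustrative assumptions of mine, not figures from a real test.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users needed in EACH group for a two-sided
    two-proportion z-test (normal approximation, equal group sizes)."""
    if p_control == p_variant:
        raise ValueError("The two rates must differ to define an effect.")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for significance level
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    variance_sum = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    n = (z_alpha + z_beta) ** 2 * variance_sum / effect ** 2
    return math.ceil(n)


def days_to_run(n_per_group, daily_visitors, groups=2):
    """Rough test duration, assuming traffic is split evenly across groups."""
    return math.ceil(n_per_group * groups / daily_visitors)


# Hypothetical scenario: 10% baseline conversion, hoping to detect 12%.
n = sample_size_per_group(0.10, 0.12)
print(f"Users needed per group: {n}")
print(f"Approximate days at 2,000 visitors/day: {days_to_run(n, 2000)}")
```

Trying a smaller gap between the two rates, or a higher desired power, shows how quickly the required sample size (and therefore the length of the test) grows.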
