What Is A/B Testing?

Çağatay Tüylü

Jun 21, 2021

A/B testing compares two versions of a web page or app to determine which one performs better. It is essentially an experiment in which two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

The days when marketers made changes to their websites based on gut feeling alone are far behind us. We are now deep in the A/B testing era, basing our decisions on as much empirical data as possible.

To enable that, the community has looked for A/B testing tools to help us make informed decisions based on collected data. Or, in more exact terms: to soundly generalize from observed data and gain insight into the future.

This post aims to discuss the evolution these tools are currently undergoing, from the basic “frequentist” testing method used in the past (and still commonly used today) to the new Bayesian testing method which the industry is moving toward.

What is Hypothesis Testing?

At the dawn of A/B testing, statisticians provided a very basic framework for statistical inference in an A/B testing scenario. Commonly known as “Hypothesis Testing,” the procedure goes as follows:

  • Start with the existing version of the element you want to test. That existing version is now termed the “baseline” (or variation A).
  • Set up the alternative variation, the “treatment” (or variation B).
  • Calculate the required sample size. This calculation is based on the baseline’s current conversion rate (which must already be known), the minimum difference in performance you wish to detect, and the desired statistical power (a worked example follows this list).
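
As a concrete illustration of the sample-size step, here is a minimal Python sketch using the statsmodels library. The baseline rate, minimum detectable rate, and power values below are made-up example numbers, not figures from any real test.

```python
# A sketch of the sample-size calculation, assuming statsmodels is installed.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.10             # known conversion rate of the baseline (variation A)
minimum_detectable_rate = 0.12   # smallest conversion rate worth detecting for B
alpha = 0.05                     # significance level
power = 0.80                     # desired statistical power

# Convert the two proportions into an effect size (Cohen's h),
# then solve for the number of visitors needed in each variation.
effect_size = abs(proportion_effectsize(baseline_rate, minimum_detectable_rate))
n_per_variation = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=alpha, power=power, alternative="two-sided"
)
print(f"Required visitors per variation: {round(n_per_variation)}")
```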

Once enough samples have been collected, we can observe the performance of each variation and calculate whether the stronger-performing variation is, in fact, better than its competitor in a statistically significant fashion. Again, an online significance calculator may help, emitting the p-value, a.k.a. “confidence.” But what does the p-value actually stand for?
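
To make that significance check concrete, here is a minimal sketch of a two-proportion z-test with statsmodels; the visitor and conversion counts are invented purely for illustration.

```python
# A sketch of the significance test between variation A and B.
from statsmodels.stats.proportion import proportions_ztest

conversions = [100, 124]   # conversions observed for A and B (made-up numbers)
visitors = [1000, 1000]    # visitors shown each variation

# Two-sided two-proportion z-test; a p-value below 0.05 is the usual
# threshold for calling the difference statistically significant.
z_stat, p_value = proportions_ztest(conversions, visitors)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```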

How Does A/B Testing Work?

Let’s say your metric is the number of visitors who click on a button, and the change you want to test is the button’s size. To run the test, you show two sets of users (assigned at random when they visit the site) the different versions, where the only thing that differs is the size of the button, and determine which influenced your success metric the most. In this case, which button size caused more visitors to click?

In real life there are lots of things that influence whether someone clicks. For example, it may be that those on a mobile device are more likely to click on a certain size button, while those on desktop are drawn to a different size. This is where randomization can help — and is critical. By randomizing which users are in which group, you minimize the chances that other factors, like mobile versus desktop, will drive your results on average.
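
One common way to implement that random assignment is to bucket each visitor deterministically by hashing their id, so the same visitor always sees the same variation. The sketch below is illustrative only; the function names and the 50/50 split are assumptions, not any particular tool’s API.

```python
# A minimal, illustrative sketch of deterministic random assignment.
import hashlib

def assign_variation(user_id: str, experiment: str = "button-size") -> str:
    """Assign a user to variation A or B for a given experiment."""
    # Hash the user id together with the experiment name so the same user
    # always lands in the same group for this experiment.
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # a number from 0 to 99
    return "A" if bucket < 50 else "B"      # 50/50 split

print(assign_variation("visitor-42"))  # e.g. "B"
```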

There are many examples like this for different problems, whether on your website, at your company, or in a new project. We always want to measure something, and the A/B test is one of the most widely used techniques for doing so, with statistics at its core.

How Do Companies Use A/B Testing?

A/B testing is now used to evaluate everything from website design to online offers to headlines to product descriptions.
Most of these experiments run without the subjects even knowing.

And it’s not just websites. You can test marketing emails or ads as well. For example, you might send two versions of an email to your customer list (randomizing the list first, of course) and figure out which one generates more sales. Then you can simply send out the winning version next time. Or you might test two versions of ad copy and see which one converts visitors more often. Then you know to spend more promoting the more successful one.
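
For the email case, the “randomize the list first” step might look like the sketch below; the customer list, counts, and conversion numbers are purely illustrative.

```python
# A sketch of randomly splitting a customer list into two email groups.
import random

customer_emails = [f"customer{i}@example.com" for i in range(10_000)]  # placeholder list

# Shuffle the list first, then split it in half: one half gets email A,
# the other gets email B.
random.shuffle(customer_emails)
half = len(customer_emails) // 2
group_a, group_b = customer_emails[:half], customer_emails[half:]

# After the campaign, compare conversion rates (sales / emails sent)
# and keep the winning version for the next send.
conversion_a = 230 / len(group_a)   # made-up results
conversion_b = 278 / len(group_b)
print("Winner:", "A" if conversion_a > conversion_b else "B")
```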
