10 Tips about A/B Testing Experimentation

Zen Liu · Published in Agile Insider · Sep 19, 2023 · 7 min read

Back when I first started out in product management in Gojek Singapore, I was thrown into the deep end of the pool, so to speak. A/B testing was more of a buzzword than a practice to me. I vividly remember the excitement of rolling out my first big feature and having solid, data-driven evidence to validate its impact. That’s when A/B testing shifted from being just a buzzword to a critical tool in my toolkit.

Yet, as with all tools, its effectiveness can be limited if it’s not communicated the right way, used in the right manner, or applied in the appropriate context. Just as a potent spice can either elevate a dish or ruin it based on its usage, A/B testing can be the game-changer or a misleading compass based on its application.

Through this article, I’ll be sharing the lessons learned from the trenches. Whether you’re just getting acquainted with A/B testing or you’re looking to refine your expertise, these tips will guide you away from common pitfalls and bolster your impact. Dive in, and together, let’s transform theory into actionable wisdom.

Here are 10 tips about A/B testing experiments that every #productmanager #productdesigner #marketers #dataanalyst #datascientist must know.

#1 Ceteris Paribus

The Latin term “Ceteris Paribus” might seem arcane at first glance, but its application in A/B testing is profoundly practical. Translating to “all other things being equal,” it serves as a foundational principle in experimental design.

Imagine testing two variations of a webpage — one on a lazy Sunday and the other during a frenetic Black Friday sale. The varying external conditions would make the data from these two tests incomparable. Herein lies the value of ceteris paribus. By ensuring that all variables remain constant, except the one under scrutiny, the integrity of the experiment remains intact.

This consistency provides a controlled environment, making it easier to attribute any differences in results directly to the variations themselves, rather than some external, uncontrollable factors.

#2 Randomization

“Randomization” may sound like technical jargon tossed around by statisticians, but its role in A/B testing is pivotal.

At its core, randomization is about ensuring that test participants are allocated to different groups (like Version A or Version B) purely by chance, making these groups statistically equivalent. This unbiased assignment is crucial because it controls for confounding variables — those pesky external factors that could inadvertently influence results.

Without randomization, you risk introducing selection bias. Imagine if tech-savvy users predominantly see Version A while less tech-savvy users see Version B. Any outcome differences could be due to the users’ tech proficiency rather than the changes you’re testing. Such a bias defeats the entire purpose of A/B testing.
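To make this concrete, one common way to get unbiased yet stable assignment is to hash a user identifier into a bucket. The sketch below is a hypothetical Python example, not any particular tool's API; the `user_id` and experiment name are illustrative assumptions.

```python
# A minimal sketch of unbiased variant assignment via hashing (hypothetical).
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically map a user to a variant with equal probability.

    Hashing the user ID together with the experiment name gives every user
    the same chance of landing in each group, independent of traits such as
    device type or tech-savviness, while keeping the assignment stable
    across sessions.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# Example: the same user always sees the same version of the page.
print(assign_variant("user_42", "homepage_redesign"))  # e.g. "B"
```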

#3 Statistical Significance

Statistical significance does not mean practical significance. Just because a difference is statistically significant does not mean that it is large or important. It just means that the observed difference is unlikely to be due to random chance.

Imagine you find a new route to work that is statistically significantly faster by 30 seconds. While this is technically faster, do those 30 seconds make a big difference in your daily life? Probably not. That’s the difference between statistical and practical significance.

Always ask yourself, beyond the stats, what’s the real-world impact?
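As a toy illustration, here is a sketch with made-up conversion counts using the statsmodels library: with a large enough sample, even a lift of 0.05 percentage points comes out statistically significant, yet it is unlikely to move the business.

```python
# Toy illustration with made-up numbers: a tiny lift becomes statistically
# significant once the sample is large enough, but may not matter in practice.
from statsmodels.stats.proportion import proportions_ztest

conversions = [101_000, 100_000]       # Version B vs. Version A conversions
visitors = [2_000_000, 2_000_000]      # visitors per variant

z_stat, p_value = proportions_ztest(conversions, visitors)
absolute_lift = conversions[0] / visitors[0] - conversions[1] / visitors[1]

print(f"p-value: {p_value:.3f}")              # roughly 0.02, below the usual 0.05 threshold
print(f"absolute lift: {absolute_lift:.2%}")  # only 0.05 percentage points
```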

#4 Frequentist Approach

With the Frequentist approach to A/B testing, one way to communicate the result could be:

Statistical: “After running the test for 2 weeks, we saw that Version B had a bounce rate of 3% while Version A had 6.5%. Since the result is statistically significant, the difference is unlikely to be due to random chance.”

Implications: “This means that if we implement Version B, we could potentially reduce the bounce rate by more than half. Given our average traffic, this could translate to keeping thousands more users engaged on our platform."
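A rough back-of-the-envelope sketch of how those numbers translate into impact, using a hypothetical monthly traffic figure that is not from the example above:

```python
# Back-of-the-envelope impact estimate from the bounce rates above.
bounce_a, bounce_b = 0.065, 0.03       # measured bounce rates from the test
monthly_visitors = 500_000             # assumed traffic, for illustration only

relative_reduction = (bounce_a - bounce_b) / bounce_a
extra_engaged = monthly_visitors * (bounce_a - bounce_b)

print(f"Relative reduction in bounce rate: {relative_reduction:.0%}")  # ~54%, "more than half"
print(f"Additional engaged visitors per month: {extra_engaged:,.0f}")  # 17,500 at this traffic
```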

#5 Bayesian Approach

With the Bayesian approach to A/B testing, one way to communicate the result could be:

Statistical: “There is an 85% chance that Version B has an 8% lift over Version A. This means that Version B is likely to have a higher engagement rate compared to Version A.”

Implications: “If we look at our average engagement metrics, this 8% lift could translate to more users interacting with our platform, leading to a potential increase in revenue and user retention.”
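If you want to produce statements like the one above yourself, a minimal Beta-Binomial sketch with Monte Carlo sampling looks like this. The engagement counts are invented for illustration and will not reproduce the exact 85% / 8% figures quoted above.

```python
# Minimal Bayesian A/B sketch: Beta(1, 1) priors updated with made-up
# engagement counts, then Monte Carlo sampling from the posteriors.
import numpy as np

rng = np.random.default_rng(42)

conv_a, n_a = 4_800, 100_000           # Version A: engaged users, visitors
conv_b, n_b = 5_200, 100_000           # Version B: engaged users, visitors

samples_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=200_000)
samples_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=200_000)

relative_lift = (samples_b - samples_a) / samples_a

print(f"P(B beats A): {(samples_b > samples_a).mean():.1%}")
print(f"P(lift > 8%): {(relative_lift > 0.08).mean():.1%}")
print(f"Median lift:  {np.median(relative_lift):.1%}")
```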

#6 Don’t use it to win an argument

The goal of A/B testing is to improve the performance of your product. It is important to be objective when analyzing the results of the test and to be willing to change your hypothesis if the data does not support it.

Some of the most groundbreaking discoveries in product management have come from unexpected A/B test results. If one always expects the data to align with their hypothesis, they close themselves off to these potential revelations.

Being open to surprises means acknowledging the possibility of being wrong and embracing it. It’s in these moments of unexpected insight that true innovation often occurs.

#7 It is expensive

The cost of A/B testing isn’t just monetary; it’s a holistic investment. Let me break it down:

Time: From setting up the hypothesis to analyzing results, A/B testing isn’t something you can rush through. Precision takes time, and shortcuts often lead to unreliable results.

Resources: You need a decent infrastructure to run tests, especially when you’re dealing with high volumes of traffic. Plus, if you’re testing a feature that requires development, then there’s the cost associated with engineering and designing the tests.

Talent: It’s not about just setting up a test and hoping for the best. Interpreting results, understanding the nuances of statistical significance, and even setting up tests in the right way requires specialized skills. Whether it’s a data scientist or an experienced UX researcher, the human expertise isn’t cheap.

Software and Services: While platforms like Optimizely and VWO have made A/B testing more accessible, they come with their own costs. Sure, they provide a fantastic suite of tools and dashboards to make testing simpler, but remember, quality comes at a price.

#8 It can be time-consuming

A/B testing requires running the test long enough to get reliable results, but not so long that you miss out on making timely decisions. There’s an inherent opportunity cost. The longer you run a test, the longer it might take to implement beneficial changes or rectify potential issues. Delayed decisions can lead to lost revenue, reduced user engagement, or other missed opportunities.

It is important to be strategic about when and how you use A/B testing.

  • Prioritize High-Impact Changes: Focus on testing elements that are likely to have a significant impact on key metrics. Minor cosmetic changes, for instance, might not warrant a full-scale A/B test unless they directly tie into the user experience or conversion metrics.
  • Limit the Number of Variants: More variants can mean longer testing times. If you’re testing multiple versions, ensure there’s a compelling reason to do so.
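One way to be strategic about the time cost is to estimate the required sample size, and therefore the test duration, before you start. The sketch below uses statsmodels' power calculations; the baseline rate, minimum detectable effect, and daily traffic are all assumptions chosen purely for illustration.

```python
# Rough sample-size and duration estimate before committing to a test.
# Baseline rate, minimum detectable effect, and traffic are assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.05                        # current conversion rate
target = 0.055                         # smallest lift worth detecting (5% -> 5.5%)
daily_visitors_per_variant = 2_000     # assumed traffic per variant per day

effect_size = proportion_effectsize(target, baseline)
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.8, alternative="two-sided"
)

days = n_per_variant / daily_visitors_per_variant
print(f"Sample size per variant: {n_per_variant:,.0f}")  # roughly 31,000
print(f"Estimated test duration: {days:.0f} days")       # about two weeks at this traffic
```

If the estimated duration is longer than the decision can wait, that is a signal to test a bigger change, accept a larger minimum detectable effect, or skip the test altogether.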

#9 It is not a silver bullet

A/B testing shows you what happened, not why. You need to dig deeper, talk to users, or combine with other research methods to get the full picture. Here’s where methods like user interviews and usability testing come into play:

  • User Interviews: Direct conversations with users can unveil their motivations, frustrations, and needs. This feedback can help explain the results from an A/B test.
  • Usability Testing: Observing users as they interact with a product can highlight areas of confusion or friction that might not be evident from quantitative data alone.

By blending quantitative data with qualitative insights, product teams can create more user-centric solutions and make informed decisions that drive genuine, lasting improvements.

#10 It is not an excuse to deploy low-quality products ;)

A/B testing should be used to test hypotheses and improve the user experience, not to release features that are not ready for prime time.

Deploying low-quality products under the guise of A/B testing can have long-term costs:

  • Brand Reputation: First impressions matter. Introducing subpar features can tarnish the brand’s image and reputation in the market.
  • Data Skew: Poorly executed features can lead to misleading A/B test results. For instance, if a feature has a major usability flaw, the test won’t accurately reflect its potential impact until the flaw is corrected.

A/B testing complements the product development process — it doesn’t replace other stages. Initial ideation, design, development, and QA are all integral phases that should precede A/B testing.
