When A/B Testing is Counterproductive and Why it Matters

Julius Uy
Big O(n) Development
5 min read · Apr 24, 2022

A mere observation of the product management zeitgeist today quickly reveals that there is no shortage of information about why A/B testing is important and how to use it. In fact, entire companies such as LaunchDarkly, GrowthBook, AB Tasty, and several others exist precisely because so many tech companies need A/B testing solutions.

That Duolingo, Twitter, Facebook, and Netflix have been exceptionally intentional about running their tests further cements the primacy of A/B testing across the industry. Yet one must wonder at which point A/B testing really drives value for the company, and at which point one should hold his ground and reconsider its value. This article is about that.

What is A/B Testing?

Just kidding. You don’t need this if you are reading this article. If you really don’t know what A/B testing is, just read this and come back.

The (Invisible) Cost of A/B Testing


Believe it or not, running an A/B test is ridiculously expensive.¹ While Ronny Kohavi famously noted that 60–90% of well-intentioned product ideas do not directly contribute positively to the product (and therefore A/B testing is here to save the day), one might also add that the cost of running a test may actually outweigh the benefit one hopes to achieve.

Here’s what happens when you run an A/B test:

  1. Someone has a hypothesis he wants to validate. This can range from product direction disagreements (especially common in startups) to simple due diligence (such as in Big Tech).
  2. The product manager defines the part of the product to A/B test and the metrics to collect.
  3. The initiative is then briefed to the software engineers, who write the code to make it happen, including tracking (a rough sketch of this step follows below the list).
  4. The QA team validates that the A/B test is implemented correctly, including the tracking.
  5. The app is then launched and the experiment begins.
  6. The product manager runs the test and waits for a few days. (Or weeks)
  7. Once the test reaches statistical significance (and many a time it doesn't), the data analyst reviews the data and makes sure the edge cases are covered before delivering his report.
  8. The product manager reviews the test and decides what direction to take the product.

This all assumes there were no bugs in the tracking or in the implementation of the product feature. Otherwise, the cycle repeats.
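
To make step 3 concrete, here is a minimal sketch of what the engineering side of a simple copy test might look like. Everything in it is hypothetical: the experiment key, the bucketing rule, and the track() function are stand-ins for whatever flagging and analytics stack your team actually uses.

    import hashlib

    EXPERIMENT = "signup_copy_test"  # hypothetical experiment key

    def assign_variant(user_id: str) -> str:
        """Deterministically bucket a user into control or treatment, 50/50."""
        digest = hashlib.sha256(f"{EXPERIMENT}:{user_id}".encode()).hexdigest()
        return "treatment" if int(digest, 16) % 100 < 50 else "control"

    def track(event: str, user_id: str, properties: dict) -> None:
        """Stand-in for the analytics pipeline; in reality this goes to the warehouse."""
        print(event, user_id, properties)

    def signup_button_copy(user_id: str) -> str:
        variant = assign_variant(user_id)
        track("experiment_exposure", user_id, {"experiment": EXPERIMENT, "variant": variant})
        return "Start learning for free" if variant == "treatment" else "Sign up"

Even this toy version still has to be specified, reviewed, QA'd against both variants, and verified end to end in the analytics pipeline, which is where the man-days below come from.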

Now let’s say I want to check which variant of the text should apply in the following example:

[Image caption: Holy Smoly 🦇 Batman, I saw this image somewhere and they’re all looking up like the moon is a disco ball]

Here’s the rough cost breakdown for the A/B test.

[Cost breakdown table: the man-days add up to roughly $3,200. If that raises doubts, $3,200 is six months’ worth of food budget in Singapore.]

The man-days may look bloated at first, until one considers the time the staff spend writing up the requirements, seeking clarifications, sitting in meetings, debugging, testing, communicating with the higher-ups, ensuring data cleanliness, and so forth. These are costs that are usually overlooked in estimates.²
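
To see how quickly those man-days turn into dollars, here is a back-of-envelope version of the arithmetic. The effort split and the blended rate below are round numbers of my own for illustration, not the figures from the breakdown above.

    # Purely illustrative figures; your team's effort and rates will differ.
    man_days = {"product": 2, "engineering": 3, "qa": 2, "analytics": 1}
    blended_rate_per_man_day = 400  # hypothetical loaded cost per man-day

    total_cost = sum(man_days.values()) * blended_rate_per_man_day
    print(total_cost)  # 3200, and that is for a single clean run with no bugs and no reruns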

Is It Worth It?

Almost every management decision is, in effect, an exercise in constraint programming. This is why operations research is a vastly underappreciated field; even a nascent awareness of it helps the decision-maker improve the quality of his everyday decisions.


A sufficiently senior person with enough experience in the industry will always answer questions like this with “it depends”. Indeed, it depends. If the hypothesis is strong enough that a few percent of improvement in conversion can increase revenue by enough to, over time, pay off the cost incurred by the A/B test, then by all means run it. Yet many a time in discussions and disagreements, inexperienced executives, managers, and ICs beeline into A/B testing because we ought to do Data-Driven Decision Making and shut the HiPPO up.³ Fair point, yet there’s more to it than meets the eye.
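
One rough way to put numbers behind that “it depends” is to compare the expected payoff of the test against its cost. The inputs below are hypothetical placeholders; the shape of the comparison is the point, not the figures.

    # Hypothetical inputs; substitute your own numbers.
    monthly_revenue_from_flow = 20_000  # revenue attributable to the flow under test
    expected_relative_uplift = 0.02     # the lift the hypothesis promises
    chance_variant_wins = 0.3           # per Kohavi, most ideas do not move the metric
    cost_of_test = 3_200                # from the breakdown above

    expected_monthly_gain = monthly_revenue_from_flow * expected_relative_uplift * chance_variant_wins
    months_to_break_even = cost_of_test / expected_monthly_gain
    print(round(months_to_break_even, 1))  # about 26.7 months before the test pays for itself

If the break-even horizon is longer than the feature is likely to live, the test was never worth running in the first place.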

Sometimes, NOT being data-driven enough can actually help the company more, because collecting the data may end up costing more than it is worth.⁴ One must weigh the consequences so as not to get stuck on a simple, trivial issue, and instead focus on the big-ticket items. That is to say, many a time, doing A/B testing is detrimental.
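
Part of why collecting the data is expensive is that small effects demand large samples. Here is the standard normal-approximation sample-size estimate for a two-proportion test; the baseline and target conversion rates are assumptions for illustration.

    # Sample size per arm to detect a lift in conversion rate,
    # at alpha = 0.05 two-sided (z = 1.96) and 80% power (z = 0.84).
    baseline = 0.05   # 5% conversion today (assumed)
    target = 0.055    # hoping for a 10% relative lift (assumed)
    z_alpha, z_power = 1.96, 0.84

    variance = baseline * (1 - baseline) + target * (1 - target)
    n_per_arm = (z_alpha + z_power) ** 2 * variance / (target - baseline) ** 2
    print(int(n_per_arm))  # roughly 31,000 users per arm, so 60k+ users in total

If your funnel does not see sixty-odd thousand users in a few weeks, the test either drags on for months or never reaches significance at all.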

How Then Must We Think?

We must therefore consider the cost-to-reward ratio of doing A/B testing. The books and the blogs and the gods can repeat themselves till kingdom come, yet the fact remains that your context is different from theirs. Many words were said, but far more were left unsaid, and those unsaid words are the contexts that applied to them but not to you. In the end, it’s all about constraint programming. If you had that resource, would you spend it on A/B testing? Or would you spend it elsewhere?

Beware of Best Practices. There are no best practices, only tradeoffs.

_____

¹ One might argue that mistakes are even more expensive. Maybe. But that doesn’t make A/B testing less expensive.

² Of course, your mileage may vary and the cost can be reduced in many ways. Some companies can squeeze the man-days down a bit but pay their staff more, so it’s hard to put a finger on the actual cost. Yet you get the idea.

³ The HiPPO is the Highest Paid Person’s Opinion.

⁴ This one deserves a totally separate blog. To be clear, I am a strong advocate of data-driven decision-making. Yet this is not a Law of the Universe. There are always limits to business axioms because business in itself is not axiomatic by nature.
