Acing A/B Testing

Ananya Nandan
Products, Demystified
8 min read · Jan 9, 2022


Product features take days, even months, to be decided upon and developed; product teams usually spend an enormous amount of time deciding which features to pursue, aligning them with the company’s overall strategy and collaborating with various stakeholders to build them. User research, ideation, design and development are all important, time-consuming phases that teams tick off their checklist while working on a product. However, the work of product managers does not end at the development phase; this is where their goal shifts from developing a product to rolling out an efficient, bug-free and frictionless product (or feature) for their target users, one that solves what it intends to solve.

Since product teams need to keep the user in focus at all times, A/B testing does exactly that: it compares variants of a single feature apples-to-apples, so the results reflect how actual users respond to that specific feature.
A/B testing is used by many teams, such as marketing, web development and quality assurance; our focus here, however, is solely on how product managers employ it in their day-to-day work.

What is A/B Testing?

A/B testing (also known as split testing or bucket testing) is a method of comparing two versions of a feature against each other to determine which one performs better. A/B testing is essentially an experiment where two or more variants are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

It involves splitting the user base into two groups. This can be done by running a 50/50 split test if the user base is small, or by taking a sample from a larger user base.

The two groups then receive two versions of a feature. It could be something as small as a different colour for a button, or an entirely revamped version of the checkout flow in the application.

After a certain amount of time (which will differ depending on the test, product, goal, etc.), the product team looks at the data gathered to understand which version should be rolled out to the entire user base.
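To make the splitting step concrete, here is a minimal sketch of deterministic, hash-based bucketing in Python. The `assign_variant` helper, the experiment name and the 50/50 allocation are all hypothetical; in practice, most teams rely on an experimentation platform for this.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, split: float = 0.5) -> str:
    """Deterministically assign a user to variant A or B.

    Hashing the user id together with the experiment name keeps the
    assignment stable across sessions and independent across experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 10_000
    return "A" if bucket < split * 10_000 else "B"

# Example: split a small user base roughly 50/50.
for i in range(10):
    user = f"user-{i}"
    print(user, assign_variant(user, "new-checkout-flow"))
```

Because the assignment is a pure function of the user id, a returning user always sees the same variant, which keeps the measured behaviour consistent over the life of the test.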

Why is it important?

1. Solve user pain points

Users come to a product to achieve a specific goal. Whatever that goal may be, they may face common pain points along the way, such as confusing copy or difficulty finding a CTA button like “Buy now” or “Request a demo”.

Not being able to achieve their goals leads to a bad user experience, which increases friction and eventually hurts conversion rates. The right approach is to use data gathered through user-behaviour analysis tools such as heatmaps, Google Analytics and website surveys to identify and solve these pain points, and then keep working to improve the experience further.

2. Get better ROI from existing traffic

The cost of acquiring quality users is already extremely high. A/B testing lets product teams make the most of existing users and helps increase conversions without spending additional money on acquiring new ones. It can deliver a high ROI because, sometimes, even the smallest changes to the product can result in a significant increase in overall conversions.

3. Reduce bounce rate

Bounce rate is one of the most important metrics to track when judging how a product is performing. There may be many reasons behind a high bounce rate, such as too many options to choose from, an expectations mismatch, confusing navigation, too much technical jargon, and so on.

Since different products serve different goals and cater to different audience segments, there is no one-size-fits-all solution to reducing bounce rate. However, running an A/B test can help. With A/B testing, teams can test multiple variations of an element, or of the entire product, until they find the version that works best. This not only helps them find friction and user pain points but also improves the users’ overall experience.

4. Make low-risk modifications

Making minor, incremental changes to the product with A/B testing, instead of redesigning the entire product, reduces the risk of jeopardizing the current conversion rate.

A/B testing lets teams target their resources for maximum output with minimal modifications, resulting in an increased ROI. One example is product description changes. The team can run an A/B test when it plans to remove or update product descriptions without knowing how users will react to the change. By running the test, the team can analyze the reaction and see which way the scale tilts.

Another example of a low-risk modification is the introduction of a new feature. Before rolling it out broadly, launching it as an A/B test can help teams understand whether the change they are proposing will please users.

Implementing a change without testing it may or may not pay off in the short and long run. Testing first makes the outcome far more certain.

5. Achieve statistically significant improvements

Since A/B testing is entirely data-driven, with no room for guesswork, gut feelings or instincts, teams can quickly determine a “winner” and a “loser” based on statistically significant improvements in metrics like average session time, conversions, average order value, and so on.
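For conversion-style metrics, the winner/loser call usually comes down to a two-proportion z-test. The sketch below is a simplified, illustrative version using only the Python standard library; the conversion counts are made up, and a real analysis would also account for test duration and multiple comparisons.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Return the two-sided p-value for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)                 # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))   # standard error of the difference
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical results: variant B converts at 5.5% vs 5.0% for variant A.
p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=550, n_b=10_000)
print(f"p-value = {p_value:.3f}")  # a value below the chosen threshold (commonly 0.05) counts as significant
```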

How to Run A/B Tests?

Stage 1: Determining the data that can be captured

First, determine what types of information the team will be able to collect and analyze before building the experiment and running the test.

Stage 2: Developing the hypothesis

Based on the data available, the product manager will now want to identify opportunities for the experiment and formulate a theory about how users will react to a specific element of the product.

For example, PMs might assume that users will want the steps required to complete a task using the new feature to be ordered in a particular sequence. That’s the hypothesis.

Stage 3: Building the experiment

Now PMs need to develop the details of the test. For this, the team will have to create a variant of the planned feature, for example the same functionality but with the steps sequenced differently.

During this stage, PMs will also need to define the segments of the user base that will receive each variant of the new feature, as well as the metrics they’re going to measure.
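One way to make this stage concrete is to write the experiment down as a small, explicit definition before any traffic is routed. The structure below is purely illustrative, and the field names and example values are assumptions; most teams would capture the same information in their experimentation platform.

```python
from dataclasses import dataclass, field

@dataclass
class Experiment:
    """A hypothetical, minimal description of an A/B experiment."""
    name: str
    hypothesis: str
    variants: dict                      # variant name -> share of traffic
    primary_metric: str
    secondary_metrics: list = field(default_factory=list)

onboarding_test = Experiment(
    name="onboarding-step-order",
    hypothesis="Reordering the setup steps will raise task completion",
    variants={"control": 0.5, "reordered-steps": 0.5},
    primary_metric="task_completion_rate",
    secondary_metrics=["time_to_complete", "support_tickets"],
)
print(onboarding_test)
```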

Stage 4: Running the test

Now it is time to send the different versions of the new feature out to the various user segments and wait to see how the groups respond to each version.

The team will need to determine for itself how long to run the A/B test, how much data to collect, and so on; this varies by company, and the team should gather and analyze enough data to know it is working with a statistically significant, representative sample of the user base.
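A common way to decide how much data is enough is a rough sample-size calculation before the test starts. The sketch below uses the standard two-proportion approximation with the conventional 5% significance level and 80% power; the baseline rate and the lift worth detecting are assumptions a team would replace with its own numbers.

```python
from statistics import NormalDist

def min_sample_size(baseline: float, lift: float,
                    alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per variant to detect an absolute lift."""
    p1, p2 = baseline, baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)            # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return int(((z_alpha + z_beta) ** 2 * variance) / (lift ** 2)) + 1

# Hypothetical: 5% baseline conversion, and we care about a 1-point absolute lift.
print(min_sample_size(baseline=0.05, lift=0.01))  # roughly 8,000 users per variant
```

The smaller the lift the team cares about detecting, the larger the sample (and therefore the longer the test) needs to be.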

Stage 5: Measuring the results

Finally, PMs review the data collected from the A/B test and decide which of the two versions earned the more positive response or the greater degree of engagement from users.
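When the primary signal is an engagement metric such as average session time rather than a conversion rate, the comparison is typically a two-sample test on the per-user values. Here is a minimal, hypothetical sketch using Welch’s t-test; it assumes scipy is available, and the session times are made-up numbers (a real test would have far more users).

```python
from scipy.stats import ttest_ind

# Hypothetical per-user session times (minutes) collected during the test.
sessions_a = [4.1, 5.0, 3.8, 6.2, 4.7, 5.5, 4.9, 5.1]
sessions_b = [5.2, 6.1, 4.9, 6.8, 5.7, 6.3, 5.9, 6.0]

# Welch's t-test does not assume equal variances between the two groups.
stat, p_value = ttest_ind(sessions_b, sessions_a, equal_var=False)
print(f"p-value = {p_value:.3f}")

# A small p-value (commonly < 0.05) suggests the engagement difference is
# unlikely to be due to chance alone; otherwise, keep the control version.
```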

Netflix perfecting A/B Testing

Netflix is well known for being a great A/B testing playground! Every single feature has been tested rigorously, and users’ own feeds are constantly part of running experiments. For example, users may have noticed that titles are displayed with different artwork. One day it’s Eleven in the preview for Stranger Things, and the next it’s Steve & Nancy. That’s the algorithm figuring out which character users are most likely to click on.

Various artworks for Stranger Things shown to different sets of users on Netflix

If a user likes movies with strong female leads, the algorithm learns to show similar movies and TV shows, and to highlight the female talent in that content.

Netflix has a long history of running A/B tests. And these tests often reveal surprising information that the teams wouldn’t have anticipated.

Gibson Biddle, former VP of Product at Netflix, spoke at #ProductCon about how Netflix uses testing to find new ways to delight customers. In the early days, when Netflix still rented out DVDs, customers cited having to wait for the latest releases as their main point of dissatisfaction.

Netflix hypothesized that if they solved this problem, user retention would increase. They rolled out a very expensive test to see what would happen if they gave customers exactly what they wanted. It seems like a no-brainer: give people what they want and they’ll be happier. It turns out, spending millions to give people what they wanted only increased retention by 0.05%, about 5,000 users at the time.

A $1 million rollout to save 5,000 users, each with a lifetime value of roughly $100, works out to about $500,000 in retained value for $1 million spent. Once the test put those numbers side by side, the decision really was a no-brainer.

For further reading on the A/B testing tools available in the market, you can refer to https://blog.hubspot.com/marketing/a-b-testing-tools

And here’s an excellent YouTube playlist if you want to learn more about A/B testing: https://www.youtube.com/watch?v=z5ksWcukD9Y&list=PLEXcbK4FvkxHdNU8NhSFcw_pr6iDFWbka
