A/B Test or Rollout? Making The Choice With Analytics

Kelsey Pericak
Super.com
Published Mar 21, 2022 · 4 min read

Our Analytics team is heavily embedded in the design, implementation, and evaluation of both experiments and rollouts at Snapcommerce. By using these strategies, we help our customers save on travel and e-commerce with confidence. When we brainstorm a new testing scenario, or decide to expose a new product or feature to our consumer base, we first need to answer one question: how? It is a common misconception that a “rollout” and an “A/B test” are the same thing. In fact, these two analytically backed strategies have distinctly different use cases and benefits, and selecting the appropriate one is key to achieving successful, insightful results. In this piece, I will cover the differences between a rollout and an A/B test, and briefly explain how Data Analytics plays an active role in both methodologies.

Choosing a rollout

A rollout is the incremental introduction of a product to a targeted user base. Typically, a rollout begins by presenting the product to a small group and then scaling up its exposure. The product can be in a minimum viable product (MVP) state or fully fleshed out, depending on the formality required and the initial user base targeted. An MVP may be used when exposing a product to employees first, since the repercussions of testing internally are limited. The ultimate goal of this strategy is to present a functional, performant, good-quality product to all potential users. A rollout is often used to introduce a previously secretive product to the market, or to iron out any uncertainties before conducting a full-scale launch.

Rollouts can occur in any industry. For those who’ve enjoyed playing video games before launch, you’ve likely heard of these familiar rollout stages:

  1. Technical test
  2. Alpha
  3. Beta
  4. Launch

For our goods vertical, a rollout may look like the incremental onboarding of new suppliers to gradually increase the SKUs that we sell. This would allow us to observe consumer demand, test technical feasibility, and progressively develop supplier relations.

If you want to realize any of these benefits (see below), then you should choose a rollout:

[Image: rollout benefits, not specific to an industry or scenario and not comprehensive]

Upon selecting a rollout, these are some questions to consider:

  • What is your desired time horizon for launch? In which scenarios would you alter your launch timeline? For example: What may cause you to pause or stop the rollout, and can quantitative alerts be set up?
  • Who should be exposed to the product first?
  • Are there any milestones that you would like to achieve throughout the process, and can those milestones be tracked?
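On the question of quantitative alerts, a rollout can be gated by a simple guardrail: compare the rolled-out cohort's key metric against the unexposed baseline at each phase, and pause the rollout if it degrades beyond a tolerance. The sketch below is illustrative only; the metric, threshold, and function names are assumptions, not Snapcommerce internals.

```python
# Hypothetical rollout guardrail: pause scaling up if the exposed cohort's
# conversion rate falls more than `tolerance` below the baseline rate.
# All names and thresholds are illustrative.

def rollout_decision(baseline_rate: float, cohort_rate: float,
                     tolerance: float = 0.02) -> str:
    """Return 'continue' if the cohort is within tolerance of baseline,
    otherwise 'pause' so the team can investigate before the next phase."""
    if cohort_rate >= baseline_rate - tolerance:
        return "continue"
    return "pause"

# Baseline converts at 5.0%; the rollout cohort at 2.5% trips the alert.
print(rollout_decision(0.050, 0.025))  # pause
print(rollout_decision(0.050, 0.048))  # continue
```

In practice a check like this would run on a schedule against live metrics, with the "pause" branch firing an alert rather than just returning a string.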

Experimenting with an A/B test

An A/B test compares two similar groups, usually named “control” and “treatment”, to validate or reject a hypothesis about the performance of those groups when exposed to two different products or actions. Less commonly, an A/B test could also be the comparison of two different groups when exposed to the same product or action.

An A/B test is appropriate when trying to solve an unanswered question using real-life data. It is typically designed using two randomized groups of users that share similar characteristics, and evaluated to answer a hypothesis that is derived from business intuition or historical performance. Some examples of A/B tests would be: comparing the open rates of two emails with different subject lines, comparing the average lifetime value of customers who are exposed to different marketing tactics, and comparing the net revenue of two advertisements for the same hotel that have different images. A significance test, such as the t-test or the chi-squared test, can be used to evaluate the success of an A/B experiment.

If you want to realize any of these benefits (see below), then you should choose an A/B test:

[Image: A/B test benefits, not specific to an industry or scenario and not comprehensive]

In preparation for an A/B test, here are some questions to answer:

  • Will a tool be used to randomize and define the user groups, or will you need data analysts, scientists or engineers to build the target groups more manually ahead of time?
  • Is your hypothesis logical and forward-looking? For example: Is your test being conducted with the intention of improving something, or are you expecting to see negative performance from the treatment group? If negative, then can you use a smaller sample size to derive significant results?
  • Will this test affect the performance of other, ongoing tests? Who should be aware, internally, of the experiment?
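On the first question above, a common way to randomize group membership without a dedicated tool is deterministic hash bucketing: hash each user id with an experiment-specific salt so that assignment is random across users but stable for any given user. This is a generic sketch of that technique, not a description of any particular experimentation platform.

```python
import hashlib

def assign_group(user_id: str, experiment: str,
                 treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to 'control' or 'treatment'.

    Hashing "experiment:user_id" means the same user always lands in the
    same group, while different experiments get independent splits.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map hash prefix into [0, 1]
    return "treatment" if bucket < treatment_share else "control"

# Illustrative check: across many users the split should be close to 50/50.
groups = [assign_group(f"user-{i}", "subject-line-test") for i in range(10_000)]
print(groups.count("treatment"))  # roughly 5,000
```

Stable assignment matters because a user who sees the treatment once must keep seeing it for the duration of the test; re-randomizing on every visit would contaminate both groups.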

Why the confusion?

While these two strategies may seem vastly different when written out, it is fairly common to see people compare the users exposed to a rollout at phase 1 (the start of the rollout) against the users who were not exposed at phase 1, under the usually incorrect assumption that those two groups qualify as treatment and control. Rollouts often target specific geographies or customer segments rather than randomly selecting customers for initial exposure. It is not sound to attribute differences between two dissimilar groups to the rollout when those groups could react differently even to the exact same product. You’ve likely seen these A/B tests attempted in hindsight (after something was already implemented), and such analyses can lead to false and/or misleading insights.

We always scope the “why” and “how” before choosing one method over the other. Selecting the correct strategy will lead to more actionable and trustworthy results.

With a test-and-learn culture, Snapcommerce runs A/B tests frequently, and these experiments have contributed to our high-growth performance. As the company expands its offerings, we are also rolling out new products and features to our customers. Working cross-departmentally, our analytics team partners with domain stakeholders on the product, growth, and supply teams to make real, impactful improvements that are hypothesis-driven and analytically supported.

For more articles about technology, visit the Snapcommerce Medium homepage.

Kelsey Pericak
Director of Analytics & Data Science | Master of Management in Artificial Intelligence