Experiment Versioning

Published in Walmart Global Tech Blog, Dec 27, 2018

In “On Developing Expo”, we discussed the A/B testing process at Walmart and our platform. This article is part of a series that dives deeper into what we built and why, how it was built, and how it is used.

Experimentation Versioning lets experimenters mark changes to a test, so that each change that affects the results is explicitly recorded and results are captured separately for each version.

The Motivation

As an experimenter, you have probably found yourself in the following situation: you set up an experiment, give each variation a percentage of traffic, and start the experiment. After it has been running for a while, you find out that you need to make a change, for example to fix a bug or change the traffic allocation. But the experiment has already started! Although our platform allows changing traffic allocation, adding and removing variations, or changing treatments in a variation after the experiment has started, such changes typically make it invalid to analyze the results across the period that spans the change. How can you easily identify exactly when the change was made, or view results in isolation from that change?
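To make the problem concrete, here is a small sketch of how pooling results across a traffic-allocation change can mislead. All numbers are invented for illustration and do not come from any real experiment; the variation names and the helper functions `rate` and `pooled` are hypothetical.

```python
# (visitors, conversions) per variation. All figures are made up for illustration.
before_change = {"control": (9000, 900), "treatment": (1000, 110)}  # 90/10 split
after_change = {"control": (5000, 250), "treatment": (5000, 300)}   # 50/50 split


def rate(visitors, conversions):
    """Conversion rate for one variation."""
    return conversions / visitors


def pooled(*periods):
    """Sum visitors and conversions per variation across periods."""
    totals = {}
    for period in periods:
        for variation, (visitors, conversions) in period.items():
            v, c = totals.get(variation, (0, 0))
            totals[variation] = (v + visitors, c + conversions)
    return totals


combined = pooled(before_change, after_change)
for label, data in [("before", before_change), ("after", after_change), ("pooled", combined)]:
    print(label, {name: round(rate(*vc), 4) for name, vc in data.items()})
```

In this made-up data the treatment converts better than the control within each period, yet it looks worse once the two periods are pooled, because the allocation change shifted how much each period contributes to each variation. Viewing results per period avoids that distortion.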

The Old Way

Users previously relied on the following hack to achieve this. Every time an experimenter wanted to make a significant change to an experiment and easily identify that change in the experiment results, they had to:

Clone the existing experiment to a new experiment
Stop the existing experiment
Make some manual and painful changes behind the scenes on the new experiment to make sure the user assignment stays consistent with the old experiment.

When reviewing results, the experimenter could compare them with the older version only by flipping back and forth between experiments. As you can see, this “hack” involved a lot of error-prone manual steps, and the end results were not obvious. It was also an extremely painful process.

The New Solution

With experiment versioning, the process is simplified. Here is how we do it.

1. Every experiment starts with version 1 by default.
2. A running experiment can be restarted with a new version, including a description of the change(s).
3. The experimenter can easily see the version changes and their details in the revision history of the experiment.
4. The data pipeline for results reporting is aware of the versioning. This allows for viewing experiment results by version, which is an important aspect of this feature.
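The steps above can be sketched as a small data model. This is a minimal illustration under assumptions, not the platform's actual implementation: the class names `Experiment` and `ExperimentVersion`, their fields, and the methods `start`, `restart`, and `revision_history` are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class ExperimentVersion:
    number: int
    description: str
    allocation: Dict[str, float]          # variation name -> share of traffic
    started_at: datetime
    ended_at: Optional[datetime] = None   # None while the version is live


@dataclass
class Experiment:
    name: str
    versions: List[ExperimentVersion] = field(default_factory=list)

    def start(self, allocation, at):
        """Start the experiment; the first version is always version 1."""
        if self.versions:
            raise ValueError("experiment already started")
        self.versions.append(
            ExperimentVersion(1, "Initial version", dict(allocation), at)
        )

    def restart(self, description, allocation, at):
        """Close the live version and open the next one with a change description."""
        if not self.versions:
            raise ValueError("experiment has not been started")
        current = self.versions[-1]
        current.ended_at = at
        new = ExperimentVersion(current.number + 1, description, dict(allocation), at)
        self.versions.append(new)
        return new

    def revision_history(self):
        """One line per version, oldest first."""
        return [f"v{v.number}: {v.description}" for v in self.versions]


exp = Experiment("checkout-button")
exp.start({"control": 0.5, "treatment": 0.5}, datetime(2018, 12, 1))
exp.restart("Fix bug in treatment", {"control": 0.5, "treatment": 0.5}, datetime(2018, 12, 10))
print(exp.revision_history())
```

Because the experiment object itself persists across restarts, nothing has to be cloned, which is the main difference from the old clone-and-stop approach.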

As you can see, experiment versioning eliminates the painful manual “hack” and gives us the capability to do this in a much more streamlined and organized way.

Conclusion

In a nutshell, “Experimentation Versioning” means:

An experiment can be broken down into multiple versions, each with its own start date, end date, and isolated data.
The assignment, treatment injection, and other experiment data are maintained across versions.
Experimentation Versioning allows experiment results to be split in order to isolate changes.
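As a rough sketch of what splitting results by version could look like, the example below buckets events into versions by timestamp and computes a conversion rate per version and variation. It assumes that each version is identified by a start and end time; the article does not describe how the actual data pipeline does this, and the names `VERSIONS`, `version_at`, and `conversion_by_version` as well as the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical version windows: (version number, start, end); end=None means still live.
VERSIONS = [
    (1, datetime(2018, 12, 1), datetime(2018, 12, 10)),
    (2, datetime(2018, 12, 10), None),
]


def version_at(timestamp, versions=VERSIONS):
    """Return the version whose [start, end) window contains the timestamp."""
    for number, start, end in versions:
        if start <= timestamp and (end is None or timestamp < end):
            return number
    return None


def conversion_by_version(events, versions=VERSIONS):
    """events: iterable of (timestamp, variation, converted) tuples."""
    counts = defaultdict(lambda: [0, 0])  # (version, variation) -> [conversions, visitors]
    for timestamp, variation, converted in events:
        version = version_at(timestamp, versions)
        if version is None:
            continue  # event falls outside every version window
        bucket = counts[(version, variation)]
        bucket[0] += int(converted)
        bucket[1] += 1
    return {key: conv / total for key, (conv, total) in counts.items()}


# Made-up sample events for illustration only.
events = [
    (datetime(2018, 12, 2), "control", True),
    (datetime(2018, 12, 3), "control", False),
    (datetime(2018, 12, 11), "control", True),
    (datetime(2018, 12, 12), "treatment", True),
]
rates = conversion_by_version(events)
print(rates)
```

Keying the results on the pair (version, variation) keeps each version's data isolated while the variation names, and therefore the assignment, stay the same across versions.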

With Experimentation Versioning, the analyst can easily make changes to a running experiment without interrupting assignment and treatment injection. Experiment results are captured separately for each version, allowing the experimenter to easily identify those changes and their impact on the results, which supports smarter data-driven business decisions.

Written by Jean Pan