A/B testing with style

Martin Höller
Published in Bleeding Edge
4 min read · Jan 5, 2018

How to implement a maintainable A/B testing system in a mobile app

Illustration by Marta Pucci

In an “A/B test,” two or more variants of a feature are tested at the same time by putting users in test groups. For example, at Clue we might want to test different variants of the app’s onboarding experience to see which one performs better for new users and leads to more user accounts being created.

Implementing such a test setup usually involves an online service (e.g. Firebase, Leanplum, Optimizely) where the experiments can be designed, monitored, controlled, and analyzed. You might also roll your own A/B testing infrastructure (hint: don’t). The mobile client then connects to this remote service and requests the current user’s test group for each experiment.

Stinky code 💩

Running and maintaining A/B tests can be a messy business — and it sure was in our code base. Some of the issues we had were:

  • New experiments were usually just piled on top of the existing ones in a single Objective-C class.
  • No structure, no proper architecture, no tests.
  • Completed experiments were not removed from the code base.
  • A/B tests could have nasty side effects.
  • It was not very obvious which A/B tests were still active and relevant.

On top of all of that, we used the SDK of our remote configuration service in the wrong way. In short, it was a horrible mess with awful code smell.

A new hope

When the time came to add another A/B testing experiment, I considered the possible options:

  1. Add the experiment to the old ExperimentsHelper class as we did before
  2. Refactor the existing code to improve it a bit
  3. Rewrite the whole thing properly

Given our ongoing effort to improve and modernize the code base, this was a good opportunity to go with the third option and do a proper rewrite in Swift.

Goals 🎯

The new implementation would need to:

  • Be maintainable
  • Be testable
  • Have no direct dependency on any specific service
  • Make it easy to add and remove experiments
  • Allow customizing behavior for specific experiments if necessary

Architecture 🏗️

The ExperimentsManager Class

The main touchpoint for A/B test experiments is the ExperimentsManager class. Its sole purpose is to register experiments with a backing store and to retrieve these experiments when needed.

Experiments are defined as concrete implementations of the ABTestExperiment protocol and are referenced by a string identifier (variableName). Instances of ABTestExperiment are simply cached in a dictionary for quick retrieval. Because both ABTestExperiment and RemoteConfigurationStorable are just abstract protocols, the whole architecture is independent of any concrete service or implementation. This also makes the business logic of registering and retrieving experiments easy to test with mock implementations.
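To make the description more concrete, here is a minimal sketch of what such a manager could look like. Only the names ExperimentsManager, ABTestExperiment, RemoteConfigurationStorable, variableName, defaultValue and currentValue come from this article; the method signatures, and the choice to refresh currentValue on every lookup, are assumptions made for illustration.

```swift
// Assumed shapes of the two protocols; the signatures are guesses.
protocol RemoteConfigurationStorable {
    func register(variableName: String, defaultValue: Any)
    func value(forVariableName variableName: String) -> Any?
}

protocol ABTestExperiment {
    var variableName: String { get }
    var defaultValue: Any { get }
    var currentValue: Any? { get set }
}

final class ExperimentsManager {
    private let store: RemoteConfigurationStorable
    // Cache of registered experiments, keyed by variableName.
    private var experiments: [String: ABTestExperiment] = [:]

    init(store: RemoteConfigurationStorable) {
        self.store = store
    }

    // Registers the experiment with the backing store and caches it.
    func register(_ experiment: ABTestExperiment) {
        store.register(variableName: experiment.variableName,
                       defaultValue: experiment.defaultValue)
        experiments[experiment.variableName] = experiment
    }

    // Returns the cached experiment with whatever value the store
    // currently holds for it (nil if no test group was assigned yet).
    func experiment(named variableName: String) -> ABTestExperiment? {
        guard var experiment = experiments[variableName] else { return nil }
        experiment.currentValue = store.value(forVariableName: variableName)
        return experiment
    }
}
```

Because the manager only talks to the protocol, a unit test can hand it a dictionary-backed mock instead of a real SDK.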

Remote configuration

The remote configuration protocol contains only two functions, one for registering a variable and one for retrieving its value:
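A sketch of how such a two-function protocol might be declared is shown here. The protocol name RemoteConfigurationStorable is from this article; the function names, the use of Any, and the InMemoryConfiguration class are hypothetical and only stand in for a real service wrapper.

```swift
protocol RemoteConfigurationStorable {
    // Tells the service about a variable and its fallback value.
    func register(variableName: String, defaultValue: Any)
    // Returns the value assigned to the current user, if any.
    func value(forVariableName variableName: String) -> Any?
}

// Hypothetical dictionary-backed implementation, useful for tests or as
// a template for wrapping a real SDK.
final class InMemoryConfiguration: RemoteConfigurationStorable {
    // Simulates values the remote service assigned to this user.
    var assigned: [String: Any] = [:]
    private var registered: [String: Any] = [:]

    func register(variableName: String, defaultValue: Any) {
        registered[variableName] = defaultValue
    }

    func value(forVariableName variableName: String) -> Any? {
        return assigned[variableName]
    }
}
```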

An implementation for a specific service then only needs to implement those functions for registering and retrieving experiments. The rest of the business logic does not care about the nitty-gritty details.

ABTestExperiment

The protocol for a generic A/B test experiment looks as follows:
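As a rough sketch, the protocol could be declared along these lines. The three property names appear in this article; typing the values as Any is an assumption made here so that one protocol can cover Boolean, numeric and string experiments.

```swift
protocol ABTestExperiment {
    // String identifier under which the experiment is registered.
    var variableName: String { get }
    // Fallback used when no remote value is available.
    var defaultValue: Any { get }
    // Value received from the remote service, if any.
    var currentValue: Any? { get set }
}
```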

An experiment must always have a defaultValue. It is used, for instance, when the app is first launched without an internet connection, so the user can’t yet be put into a test group that would supply the experiment’s actual value.

The property currentValue stores the actual value of the experiment after it has been registered with a remote service and the user has been put into a test group. This is an optional property because the device may not get back a value from the remote service.

Finally, an extension of the ABTestExperiment protocol adds some typed getters to make the usage of experiment objects more convenient.
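One possible shape for such an extension is sketched here. The getter names (boolValue, intValue, stringValue) are hypothetical, and the protocol is restated with an assumed signature so the snippet compiles on its own. Each getter prefers currentValue and falls back to defaultValue.

```swift
protocol ABTestExperiment {
    var variableName: String { get }
    var defaultValue: Any { get }
    var currentValue: Any? { get set }
}

extension ABTestExperiment {
    // The remote value if one was received, otherwise the default.
    private var effectiveValue: Any {
        return currentValue ?? defaultValue
    }

    // Each getter returns nil when the value has a different type.
    var boolValue: Bool? { return effectiveValue as? Bool }
    var intValue: Int? { return effectiveValue as? Int }
    var stringValue: String? { return effectiveValue as? String }
}
```

Returning optionals keeps a type mismatch visible at the call site instead of silently substituting a value.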

Let’s experiment 💡

Now we have all the necessary building blocks for running some A/B tests.

As stated above, A/B test experiments are represented as concrete implementations of the ABTestExperiment protocol. Let’s come back to our initial example of A/B testing the onboarding experience. A very simple implementation could look like this:
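A very small experiment type might look like the following sketch. OnboardingExperiment is the name used in this article; the key "onboarding_variant" and the variant name "control" are made up for illustration, and the protocol is restated with an assumed signature so the snippet stands alone.

```swift
protocol ABTestExperiment {
    var variableName: String { get }
    var defaultValue: Any { get }
    var currentValue: Any? { get set }
}

struct OnboardingExperiment: ABTestExperiment {
    // Hypothetical identifier shared with the remote service.
    static let key = "onboarding_variant"

    let variableName = OnboardingExperiment.key
    // Users without an assigned test group see the control variant.
    let defaultValue: Any = "control"
    // Filled in once the remote service has answered.
    var currentValue: Any?
}
```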

All experiments are registered at the beginning of the app’s lifecycle, e.g. in the app delegate’s didFinishLaunching method in AppDelegate.swift.

When it is time to show the onboarding, we need to decide on which variant to use:
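The following sketch wires the pieces together: registration at launch, then the lookup when the onboarding is about to be shown. It uses condensed, assumed versions of the building blocks; the method names, the key "onboarding_variant", the variant names and the InMemoryConfiguration stand-in are all hypothetical.

```swift
protocol RemoteConfigurationStorable {
    func register(variableName: String, defaultValue: Any)
    func value(forVariableName variableName: String) -> Any?
}

protocol ABTestExperiment {
    var variableName: String { get }
    var defaultValue: Any { get }
    var currentValue: Any? { get set }
}

extension ABTestExperiment {
    // Typed getter: remote value if present, otherwise the default.
    var stringValue: String? { return (currentValue ?? defaultValue) as? String }
}

struct OnboardingExperiment: ABTestExperiment {
    static let key = "onboarding_variant"
    let variableName = OnboardingExperiment.key
    let defaultValue: Any = "control"
    var currentValue: Any?
}

// Stand-in for a real service SDK; `assigned` simulates test groups.
final class InMemoryConfiguration: RemoteConfigurationStorable {
    var assigned: [String: Any] = [:]
    private var registered: [String: Any] = [:]
    func register(variableName: String, defaultValue: Any) {
        registered[variableName] = defaultValue
    }
    func value(forVariableName variableName: String) -> Any? {
        return assigned[variableName]
    }
}

final class ExperimentsManager {
    private let store: RemoteConfigurationStorable
    private var experiments: [String: ABTestExperiment] = [:]
    init(store: RemoteConfigurationStorable) { self.store = store }

    func register(_ experiment: ABTestExperiment) {
        store.register(variableName: experiment.variableName,
                       defaultValue: experiment.defaultValue)
        experiments[experiment.variableName] = experiment
    }

    func experiment(named name: String) -> ABTestExperiment? {
        guard var experiment = experiments[name] else { return nil }
        experiment.currentValue = store.value(forVariableName: name)
        return experiment
    }
}

// At launch, for example from the app delegate:
let store = InMemoryConfiguration()
let manager = ExperimentsManager(store: store)
manager.register(OnboardingExperiment())

// Later, when the onboarding is about to be shown:
let variant = manager.experiment(named: OnboardingExperiment.key)?.stringValue ?? "control"
```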

Nice! Asking the ExperimentsManager for an experiment’s value is only one line of code.

Custom behavior

Because each experiment is implemented in its own class or struct, it is very easy to add additional behavior or functionality specific to this use case. Let’s assume the onboarding’s title string is different, depending on the user’s test group. We could add a new convenience function to OnboardingExperiment:
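Such a convenience function could be sketched as follows. The function name onboardingTitle(), the variant names and the title strings are invented for this example, and the protocol is restated with an assumed signature so the snippet compiles on its own.

```swift
protocol ABTestExperiment {
    var variableName: String { get }
    var defaultValue: Any { get }
    var currentValue: Any? { get set }
}

struct OnboardingExperiment: ABTestExperiment {
    let variableName = "onboarding_variant"
    let defaultValue: Any = "control"
    var currentValue: Any?

    // Experiment-specific behavior lives next to the experiment itself.
    func onboardingTitle() -> String {
        let variant = (currentValue ?? defaultValue) as? String ?? "control"
        switch variant {
        case "short":
            return "Get started in seconds"
        default:
            return "Welcome to Clue"
        }
    }
}
```

Since the function depends only on the experiment’s own properties, it can be unit tested by setting currentValue directly.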

Removing experiments

After the onboarding A/B test has been conducted and we have found the best-performing variant, its code can be removed from the project. For that, it is only necessary to remove the OnboardingExperiment class and all its usages, leaving the winning code path in.

Conclusion

The new implementation has some major advantages over the original one.

  • Business logic in ExperimentsManager never needs to change when switching the remote A/B testing service.
  • A/B test experiments can have additional or specialized behavior that can be unit tested.
  • Removing a finished A/B test is just a matter of deleting the class and seeing where the compiler fails.
