Published in The Startup

Android A/B Testing Made Simple With Cadabra Library

Photo by Brian Wangenheim

A/B testing complexity

Let’s start with the complexity associated with A/B testing before I offer the solution. So, what are the problems?

  • Async communications: the experiment configuration usually lives on the server, so the client has to fetch it first, and that operation is asynchronous. This raises many questions: how do we show the UI? Do we need to wait? How do we check that the latest configuration is loaded?
  • Multi-layer changes: with most modern apps adopting one form or another of layered architecture, an A/B test touches almost every layer in the app. It needs to be received from the network, stored in persistent storage, tracked in the domain-logic layer, and displayed through the UI layer. That’s too many places to make a mistake (e.g., see this article).
  • Multi-experiment tracking: as more and more experiments are added to the app, it gets harder to track which experiments exist at all, which are active at a given time, and whether they interfere with each other.
  • Sunsetting: after the experiment is over, someone needs to go and clean up the code in all the layers touched in the previous step. Things like configuration models and resources often get forgotten along the way and stay in the code base for months after they are no longer needed.
  • Boilerplate: UI changes may be as simple as a single button’s color change or as complex as a whole new layout; in both cases at least one if-else/switch-case is required. And all of it should be done in a way that nothing gets lost during sunsetting.

Solution requirements

If we were given enough time to design a complete solution to the aforementioned problems, what would it look like? Here are my proposed requirements for the library. (Scroll to the last paragraph to get straight to the code samples.)

  • Fully synchronous experiment configuration fetching: most of the tools that define A/B cohorts neither make the decision in real time nor deliver it instantaneously, so there is no point in trying to get the current state of the server-side config. Whatever state was current at the moment of app launch should be the state of the A/B test deeper in the app flow (BTW, the recommended config fetching interval for Firebase is 12h). That doesn’t just reduce bandwidth usage and increase responsiveness, it also eliminates tricky bugs caused by fetching an updated config after one or more experiments have already been shown to the user.
  • Ability to enable the experiment locally: the A and B groups for an experiment should be defined in advance and stay intact (that’s the best way to keep them truly independently randomized), but the experiment itself may not be available to a user until a certain event happens. Experimental UI A or B then has to be activated immediately after that event, ideally without a network round-trip or extra code that checks the experiment activation status.
  • Single point of configuration: to keep track of all experiments we need to keep them registered in one place, so it’s easy to see what’s active now. We’ll also need this point to set the configuration for integration and functional tests.
  • Statically typed enum-like configuration: once the experiment has started, its sub-parameters cannot change, otherwise we’ll get inconsistent data. That immutability, in turn, allows us to create a data class with a complete set of required parameters for each experiment variant. The enum nature will prevent us from accidentally skipping one of the options in a switch or when statement. Once the experiment is over we can remove the class, and the code won’t compile until we properly clean up all the places the experiment was affecting.
  • Automatic resource resolution: many experiments are as simple as showing a new layout for the same data. It would be much more convenient to provide two layouts like cart_screen_variant_compact / cart_screen_variant_verbose for an experiment that has variants “Compact” and “Verbose” and let the framework inflate the proper layout automatically. During the sunset phase, we can then remove all the resources with either the _verbose or the _compact suffix and be sure nothing unused is left.
  • Testability: we don’t want to see a random experiment variant during the testing phase, so there should be an easy way to launch any given variant, and it should be possible to mock the experimentation framework when needed.
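
To make the first and fourth requirements more concrete, here is a minimal sketch of a config snapshot frozen at launch combined with an enum-typed variant. It is not Cadabra's actual implementation: `LaunchSnapshot`, `CartExperiment`, `describe`, and the local `Variant` declaration are hypothetical names made up for illustration.

```kotlin
// Hypothetical stand-in for the library's marker interface.
interface Variant

// Each variant carries its complete, immutable parameter set (illustrative values).
enum class CartExperiment(val itemsPerPage: Int, val showThumbnails: Boolean) : Variant {
    COMPACT(itemsPerPage = 20, showThumbnails = false),
    VERBOSE(itemsPerPage = 8, showThumbnails = true)
}

// Config captured once at app launch; later changes are ignored until the next launch.
class LaunchSnapshot(remoteValues: Map<String, String>) {
    private val frozen: Map<String, String> = remoteValues.toMap() // defensive copy

    // Synchronous lookup; falls back to the first enum constant
    // when the key is absent or the stored name is unknown.
    fun <V> variantOf(key: String, type: Class<V>): V where V : Enum<V>, V : Variant {
        val constants = type.enumConstants
        val name = frozen[key]
        return constants.firstOrNull { it.name == name } ?: constants.first()
    }
}

// `when` used as an expression over an enum must be exhaustive,
// so adding or removing a variant breaks compilation right here.
fun describe(variant: CartExperiment): String = when (variant) {
    CartExperiment.COMPACT -> "compact list, ${variant.itemsPerPage} items"
    CartExperiment.VERBOSE -> "verbose list, ${variant.itemsPerPage} items"
}

fun main() {
    val remote = mutableMapOf("cart" to "VERBOSE")
    val snapshot = LaunchSnapshot(remote)
    remote["cart"] = "COMPACT" // a later change does not leak into the running session
    println(describe(snapshot.variantOf("cart", CartExperiment::class.java))) // verbose list, 8 items
}
```

Because the snapshot copies the values at construction time, every screen in the same session sees the same variant, which is the consistency the synchronous requirement is after.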


With all these requirements in mind, I’ve created a library. Let’s see how it works (links to the library repo and Maven are below)!

Both layouts and content can be changed by Cadabra

A simple experiment with automatic resource resolution

All we need to do is:

  • define an enum that extends the Variant interface
enum class AutoResourceExperiment : Variant { STRANGE, CHARM }
  • and register it like this
val ct = cadabra.getExperimentContext(AutoResourceExperiment::class)
// use `_strange` by default and `_charm` if Variant CHARM is active
enum class FancyExperiment(
    val screenLayout: Layout,
    val screenContent: Content,
    val itemsLimitPerPage: Int
) : Variant {
    CHARM(Layout.WIDE, Content.COMPLETE, 10),

val aVariant = cadabra.getExperimentVariant(FancyExperiment::class)
when (aVariant.screenLayout) {
    WIDE -> setContentView(R.layout.activity_wide)
    COMPACT -> setContentView(R.layout.activity_compact)

favoritesOnly = aVariant.screenContent == Content.FAVORITES,
numberOfItems = aVariant.itemsLimitPerPage
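
The suffix convention behind automatic resource resolution (`_strange` by default, `_charm` when CHARM is active) can be sketched in plain Kotlin. This is only an illustration of the naming idea, not the library's code: `variantResourceName`, `resolveResourceName`, the `exists` callback, and the resource names are all assumptions.

```kotlin
// Hypothetical stand-in for the library's marker interface.
interface Variant

enum class AutoResourceExperiment : Variant { STRANGE, CHARM }

// Builds a variant-specific resource name,
// e.g. "activity_main" + CHARM -> "activity_main_charm".
fun <V> variantResourceName(baseName: String, variant: V): String
        where V : Enum<V>, V : Variant =
    "${baseName}_${variant.name.lowercase()}"

// Returns the active variant's resource name if such a resource exists,
// otherwise falls back to the default variant's name.
fun <V> resolveResourceName(
    baseName: String,
    active: V,
    default: V,
    exists: (String) -> Boolean
): String where V : Enum<V>, V : Variant {
    val candidate = variantResourceName(baseName, active)
    return if (exists(candidate)) candidate else variantResourceName(baseName, default)
}

fun main() {
    // Stand-in for a real lookup; on Android it could be backed by Resources.getIdentifier().
    val bundled = setOf("activity_main_strange", "activity_main_charm", "item_row_strange")
    val exists: (String) -> Boolean = { it in bundled }

    println(resolveResourceName("activity_main", AutoResourceExperiment.CHARM, AutoResourceExperiment.STRANGE, exists))
    // activity_main_charm
    println(resolveResourceName("item_row", AutoResourceExperiment.CHARM, AutoResourceExperiment.STRANGE, exists))
    // item_row_strange
}
```

With this convention, sunsetting an experiment comes down to deleting every resource that ends with the losing variant's suffix.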

Experiments activation

In a real-life scenario, we’d need some service to control the experiments, such as Firebase, which is supported out of the box:

// register experiments without starting
// load experiments config from Firebase
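
The two steps named in the comments above (register without starting, then load the config) can be sketched with a small in-memory registry. Everything here is hypothetical and does not reflect Cadabra's real API: `ExperimentRegistry`, `register`, `applyConfig`, `start`, and the `"fancy"` key are illustrative, and a plain `Map<String, String>` stands in for values that would come from a remote config service.

```kotlin
// Hypothetical stand-in for the library's marker interface.
interface Variant

enum class FancyExperiment : Variant { STRANGE, CHARM }

class ExperimentRegistry {
    private class Entry(val key: String, val constants: List<Enum<*>>) {
        var assigned: Enum<*> = constants.first() // first constant acts as the default
        var started = false
    }

    private val entries = LinkedHashMap<Class<*>, Entry>()

    // Registers an experiment under a config key without starting it.
    fun <V> register(type: Class<V>, key: String) where V : Enum<V>, V : Variant {
        entries[type] = Entry(key, type.enumConstants.toList())
    }

    // Applies cohort assignments by variant name; unknown keys and names are ignored.
    fun applyConfig(values: Map<String, String>) {
        for (entry in entries.values) {
            val name = values[entry.key] ?: continue
            entry.constants.firstOrNull { it.name == name }?.let { entry.assigned = it }
        }
    }

    // Activates a registered experiment locally, with no network round-trip.
    fun start(type: Class<*>) {
        val entry = requireNotNull(entries[type]) { "Experiment ${type.simpleName} is not registered" }
        entry.started = true
    }

    // Until the experiment is started, callers always get the default variant.
    fun <V> variantOf(type: Class<V>): V where V : Enum<V>, V : Variant {
        val entry = requireNotNull(entries[type]) { "Experiment ${type.simpleName} is not registered" }
        val chosen = if (entry.started) entry.assigned else entry.constants.first()
        return type.cast(chosen)
    }
}

fun main() {
    val registry = ExperimentRegistry()
    registry.register(FancyExperiment::class.java, "fancy")
    registry.applyConfig(mapOf("fancy" to "CHARM"))
    println(registry.variantOf(FancyExperiment::class.java)) // STRANGE (not started yet)
    registry.start(FancyExperiment::class.java)
    println(registry.variantOf(FancyExperiment::class.java)) // CHARM
}
```

Keeping the assignment and the activation separate is what lets the cohorts stay fixed in advance while the experimental UI only appears after the triggering event.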


The key design aspects of the library

  • It’s Kotlin-first but fully supports Java: all companion object methods are exported as statics, there are no coroutines or flows, and both Class and KClass parameters are supported.
  • It’s lightweight and modularized: it can be imported as core-only for pure Java/Kotlin modules or as an Android library to enable automatic resource resolution.
  • It’s extensible: the most basic ways of resolving the active experiment’s variant are supported out of the box, including Firebase Remote Config, but custom resolvers are allowed.
  • It’s testable: the entry point is designed as a minimalistic interface, not a class, so if you need to provide a fake/mock implementation for tests, that can be done in a couple of lines of code. If you have ever tried to mock the Firebase SDK, you know what I’m talking about.
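
The testability point is easiest to see with a fake. The sketch below assumes a hypothetical one-method entry point, `Experiments`, which is not the library's real interface; `FakeExperiments` and `CartPresenter` are likewise invented for the example.

```kotlin
// Hypothetical stand-in for the library's marker interface.
interface Variant

enum class FancyExperiment : Variant { STRANGE, CHARM }

// Hypothetical minimal entry point that production code depends on.
interface Experiments {
    fun <V> variantOf(type: Class<V>): V where V : Enum<V>, V : Variant
}

// Test double: returns a forced variant when one was supplied for the
// requested experiment, otherwise the first enum constant.
class FakeExperiments(vararg forced: Enum<*>) : Experiments {
    private val forced: List<Enum<*>> = forced.toList()

    override fun <V> variantOf(type: Class<V>): V where V : Enum<V>, V : Variant =
        forced.firstOrNull { type.isInstance(it) }?.let { type.cast(it) }
            ?: type.enumConstants.first()
}

// Code under test only sees the interface, so no SDK has to be mocked.
class CartPresenter(private val experiments: Experiments) {
    fun title(): String = when (experiments.variantOf(FancyExperiment::class.java)) {
        FancyExperiment.STRANGE -> "Cart"
        FancyExperiment.CHARM -> "Your fancy cart"
    }
}

fun main() {
    println(CartPresenter(FakeExperiments()).title())                      // Cart
    println(CartPresenter(FakeExperiments(FancyExperiment.CHARM)).title()) // Your fancy cart
}
```

A test can pin any variant by constructing the fake with it, which also covers the earlier requirement of launching a given variant on demand.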



Dmitry Si

Software developer. Most recently Android Java/Kotlin engineer. Former manager, desktop and embedded software creator.