Deep Dive Into A/B Testing With Firebase Remote Config

Kapil Bakshi · Published in AndroidPub · 9 min read · Feb 21, 2018

In today’s user-centric world, where companies try their best to win their users’ hearts, it has become indispensable to capture every user data point in order to make data-driven decisions.

While planning new features, it becomes very confusing if you have a lot of variations in mind. No matter how good your hypothesis may be, ultimately it’s the users, their mindset, their behaviour, their habits, which govern the validity and correctness of your feature.

A/B Testing transforms “I guess” and “I suppose” into “I know” by letting us distribute different versions of our product among users and determine in real time which version emerges as the leader. We can then roll out that version to all the users.

In other words,

A/B Testing gives us valuable insights straight from the horse’s mouth, the users, in real time.

Let’s see how we can carry out A/B Testing using Firebase Remote Config.

Firebase Remote Config Overview

Firebase Remote Config provides a parameter-based approach in which you can load different config parameters from the Firebase cloud and change the behaviour of your app according to those parameters.

Setting up Firebase Remote Config

Assuming Firebase has already been set up in your app, add the Remote Config dependency to your app/build.gradle.
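A minimal sketch of the dependency block, assuming the 2018-era SDK; the version numbers are illustrative, so use the latest releases available to you:

```groovy
dependencies {
    // Firebase Remote Config
    implementation 'com.google.firebase:firebase-config:11.8.0'
    // Firebase core/analytics, needed for experiment targeting and goals
    implementation 'com.google.firebase:firebase-core:11.8.0'
}
```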

Setting Remote Config Parameters on Firebase Console

1.) Go to the Firebase console and choose the Remote Config option.

2.) Click on Add your first parameter.

3.) Add the parameter key and its corresponding value. My sample app is a memory game in which you get a certain number of seconds to memorise a sequence of images randomly distributed on top of the same number of tiles. After the memorising time elapses, one of the images is shown and you have to tap the correct tile.
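In my case the parameters mirror the game settings; the key names below are the ones used later in this article, with their default values:

```
memorising_time = 12   (seconds the user gets to memorise the sequence)
no_of_tiles     = 9    (number of tiles in the grid)
```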

4.) After adding the parameters click on Publish Changes.

Android Code

Declare an instance of FirebaseRemoteConfig in your activity or any other class which handles network calls. For simplicity, I am fetching the config in my Activity.

The following function fetches config data from Firebase Remote Config. You can call it in the onCreate() method of your activity.

We set default values for the remote config by calling the setDefaults() method with a map of the defaults.

We asynchronously fetch the remote config values by calling firebaseRemoteConfig.fetch(). The fetch method also accepts cacheExpiration as a parameter: the time in seconds after which the config cache expires and new values are fetched again from Firebase Remote Config. While developing, you can set it to zero so that you fetch the new values on every call and can test your changes.

Also, while developing, enable developer mode by calling setDeveloperModeEnabled(true), as the SDK imposes a rate limit on the number of fetch calls. With developer mode enabled, this restriction is removed.

firebaseRemoteConfig.setConfigSettings(
        new FirebaseRemoteConfigSettings.Builder()
                .setDeveloperModeEnabled(true)
                .build());
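Putting the pieces together, a minimal sketch of the fetch flow might look like this. It assumes the 2018-era Remote Config API (setDefaults/fetch/activateFetched) and the parameter names memorising_time and no_of_tiles used earlier; startGame() is a hypothetical helper of the sample app, and the block needs the Firebase SDK and an Android runtime to compile.

```java
private FirebaseRemoteConfig firebaseRemoteConfig;

private void fetchRemoteConfig() {
    firebaseRemoteConfig = FirebaseRemoteConfig.getInstance();

    // Developer mode lifts the fetch rate limit while testing.
    firebaseRemoteConfig.setConfigSettings(
            new FirebaseRemoteConfigSettings.Builder()
                    .setDeveloperModeEnabled(BuildConfig.DEBUG)
                    .build());

    // Defaults used until a fetch succeeds (and as a fallback if it never does).
    Map<String, Object> defaults = new HashMap<>();
    defaults.put("memorising_time", 12L);
    defaults.put("no_of_tiles", 9L);
    firebaseRemoteConfig.setDefaults(defaults);

    // 0 while developing so every fetch hits the server; ~1 hour in production.
    long cacheExpiration = BuildConfig.DEBUG ? 0 : 3600;

    firebaseRemoteConfig.fetch(cacheExpiration)
            .addOnCompleteListener(this, task -> {
                if (task.isSuccessful()) {
                    // Make the fetched values visible to getLong()/getString().
                    firebaseRemoteConfig.activateFetched();
                }
                long memorisingTime = firebaseRemoteConfig.getLong("memorising_time");
                long noOfTiles = firebaseRemoteConfig.getLong("no_of_tiles");
                startGame(memorisingTime, noOfTiles); // hypothetical helper
            });
}
```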

OK, so now we know how to use Remote Config and make changes to our app without rolling out an app update.

Let the A/B Testing Begin

In a nutshell, A/B Testing using Firebase Remote Config involves the following steps:

1.) Create an experiment

2.) Select target users

3.) Define Variants

4.) Define goals on the basis of which you want to find the best version (the leader)

5.) Analyse Results

Creating an experiment

1.) Click on the A/B Testing button in the top-right corner under the Remote Config section.

2.) Fill in the Experiment Basics section. As you can see in the image below, we are defining the name of our experiment, selecting the app to run the experiment on, selecting target users with specific properties, and also the percentage of those target users on which the experiment is to be run.

The drop-down to filter target users would look something like this. Along with some predefined options, you can also link your Firebase Analytics properties here.

Defining Variants

Now you would want to create different variants, which vary from each other in terms of the config parameters defined in Firebase Remote Config, which are in turn mapped to different components in your app. In my case, I am creating two variants: one in which I have increased the memorising time from 12 seconds (the default) to 20 seconds, and another in which I have decreased the number of tiles to memorise from 9 (the default) to 6.

The Control Group is a subset of the Target Users who would receive the normal version of your app without any variations. The behaviour of the variants is compared with this control group to measure their success.

Defining Goals

In this part we define the factors on which we want to measure the performance and success of the different variants. My primary goal in decreasing the number of tiles and increasing the memorising time was to decrease the complexity of the game, thereby trying to increase user engagement. A difficult game would certainly demotivate and frustrate users and may become a reason for drop-off.

As you can see in the picture below, we have chosen user_engagement as a goal, along with other metrics like retention. You can also link different properties which are defined in your Firebase Analytics.

In the Advanced options, you’ll find another drop-down for the Activation Event. An activation event is like an entry point, a mandatory condition which must be fulfilled or triggered by a user in order for him/her to become a part of your experiment.

Please note that the user might have triggered the activation event in the past, before you ran the experiment, but that won’t be considered. The activation event has to happen while your experiment is running.
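For example, if the activation event is that the user actually started a game, you could log it with Firebase Analytics when the game screen appears. The event name game_started and its parameter below are my own illustrative choices, not ones defined by the sample app, and the snippet needs the Firebase SDK to compile:

```java
// Log the hypothetical activation event when a game actually starts,
// so only users who played during the experiment window are enrolled.
FirebaseAnalytics analytics = FirebaseAnalytics.getInstance(this);
Bundle params = new Bundle();
params.putString("entry_point", "main_menu"); // illustrative parameter
analytics.logEvent("game_started", params);
```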

Click the review button and you’ll be directed to a page where you can see the experiment details you just defined.

Adding Test Devices

You can also add test devices by specifying the Firebase InstanceId of your test device and the variant you would want that device to get.
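One way to obtain that ID on a debug build, assuming the 2018-era FirebaseInstanceId API, is to log the token and copy it from Logcat:

```java
// Print this device's Instance ID token so it can be registered as a test device.
String token = FirebaseInstanceId.getInstance().getToken();
Log.d("ABTest", "Instance ID token: " + token);
```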

Click on the Start Experiment button to begin your experiment. After doing this you would see the following screen.

So yes, it is indeed difficult to find the best variant, the leader, right away. So let’s sit back and wait for at least two weeks to find out the results.

Analysing Results

Congratulations on successfully performing the above steps and reaching this section! You can press the clap button multiple times to show your happiness.

Now comes the most important and most critical part of A/B Testing: the result analysis. We are curious to know how the variants have performed, which is/are the leader(s), the performance statistics, and most importantly what final action should be taken, i.e. which variant to fully roll out, or whether to roll out none at all.

So let’s see what our experiment tells us.

Since my app is not a real one and is not live on the Play Store, I got the following.

Well, that was expected; it’s difficult, rather impossible, for anyone to tell you the best variant if your sample size is small.

The job of an A/B Testing tool is not simply to tell you that certain variant(s) performed well, but also to ensure that this better performance was not due to random chance but actually because of the change the variant(s) introduced.

For this, Firebase applies statistical algorithms under the hood, which in turn need a decent sample size and a decent time period.

So let’s take the example of a live app which gets a few thousand monthly active users and study its A/B testing results.

This app wants to find out if removing ads improves user retention.

Let’s have a look at the results.

So as per the results, we don’t have a leader at all. That means none of the variants seems to be better than the control group if user retention is our matter of concern. Pretty interesting! This also means that the ads the app is showing do not affect user retention.

Massive kudos to Firebase for letting the developers know that ads have little effect on user retention. Had they simply stopped or reduced ads based on their intuition or some hypothesis, they would have cut off their source of profit without gaining anything concrete.

Let’s also have a look at the Improvement Overview section, which quantitatively tells us how the variants have performed compared to the control group on the different metrics we set as goals.

As you can see, corresponding to every goal on the top row we have different metrics and their quantitative comparison with the control group.

Values like -7% to 13% in terms of improvement, and 71% in terms of probability to be the best variant (around 90% is considered significant; it starts at 50% at the beginning of the experiment), are highlighted in grey, emphasising that these numbers are not significant enough to declare the Ads disabled variant the leader.

Conclusion and Key takeaways

1.) Perform A/B Testing only if you have a decent number of active users.

2.) Sometimes not getting a leader is as beneficial as getting one.

We saw this in our ads-disabling experiment; it was a life saver and an eye opener for the developers.

3.) A/B Testing is not for everyone

If you are a young startup rolling out features at the speed of light, you should not consider starting A/B testing right away. Let the product become stable, and then you can make those subtle changes to your app and perform A/B testing.

4.) Avoid Configuring Unrelated Parameters in a Variant

Do not put config values for unrelated parameters in the same variant. If you do, you won’t be able to find out which of the parameters brought the change to your variant while analysing the results. For example, had I put a combination of memorising_time and no_of_tiles in the same variant and got an increase in user_retention, I would not have been able to find out which change brought that increase: the increase in memorising time or the decrease in the number of tiles.

5.) A/B Testing is just like Icing On the Cake

A/B Testing is only for testing subtle changes in your app, and that too when you have a somewhat stable product. If you are building something from scratch, market research, domain knowledge, and design thinking are the aspects you should focus on in order to make your product fluffy and spongy to the core, like a cake. Put the icing layer of A/B testing on only after the product is stable and has a decent number of users.

So that was it. Thank you so much for reading the article. Your valuable suggestions and feedback are always welcome.

Please don’t hesitate to press the clap button as many times as you want if you liked the post.

Follow me on twitter, for more updates. https://twitter.com/akapil167

