The 10 Limitations of Firebase A/B Testing (ft. Hackle A/B Testing)

Jamie Lee
Published in Hackle Blog
Mar 28, 2022

What is Firebase A/B Test?

Firebase A/B Testing is a tool provided by Google that lets tech startups measure the impact of feature changes in their app on company-wide KPIs and metrics, so that they can build the app users want.

Firebase A/B Testing is popular with beginners who are just getting started with A/B testing. It is also an attractive choice for teams that have already integrated the Firebase platform in order to use its other app-building tools.

However, as users continue experimenting on their apps, they will slowly start to realize that there are many limitations to the type of A/B tests that can be carried out with Firebase. Moreover, as the data starts to pile up, users will start to question the reliability and accuracy of the data collected by Firebase.

Here at Hackle, we understand the distinct needs that users have with regard to A/B testing. We designed a platform that not only tracks data on unique users accurately but also gives users the flexibility to design the A/B tests that fit each company’s unique needs.

This post outlines the limitations of using Firebase’s A/B Testing tool.

The 10 Limitations of Firebase

  1. Firebase offers only a few types of predetermined metrics.
  2. Firebase doesn’t allow you to segment data results for each metric.
  3. Firebase does not offer advanced user targeting features.
  4. Firebase takes time to load changes to an A/B test setting.
  5. Firebase does not update A/B test results to the dashboard in real-time.
  6. Firebase mainly focuses on client-side SDK.
  7. Firebase samples data on users allocated to A/B tests.
  8. Firebase may be prone to errors when calculating data for metrics related to conversion rates.
  9. Firebase has a limit to the number of metrics that can be set per A/B test.
  10. Firebase does not support simultaneous A/B testing on both the production and development environments.

Detailed Description

Firebase offers only a few types of predetermined metrics.

You cannot create new metrics with Firebase. This means that you cannot define your own rules for calculating user data, and your A/B test results can only be processed with the built-in rules.

Source: Firebase dashboard

With Hackle, however, you can create highly customizable metrics by directly choosing the event that serves as the denominator or the numerator of the metric. On top of the metric equation, you can also customize how the event values are calculated. Being free to customize the denominator and numerator means that you can track and measure important metrics such as conversion rates, average order value (AOV), average revenue per user (ARPU), and average revenue per paying user (ARPPU).

Additionally, there is also a “filter” setting that allows you to only calculate the metrics for specific user segments.

You can set your own desired metric, such as ‘average purchase amount per user who purchased an item or triggered a purchase event’. (Source: Hackle Dashboard)
A customizable setup lets you track the metrics that actually matter to your business.
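To make the numerator/denominator idea concrete, here is a minimal sketch of how such metrics could be computed from raw events. The event log format and the helper names are hypothetical and chosen for illustration only; this is not Hackle's or Firebase's API.

```python
# Hypothetical event log: (user_id, event_name, value).
# The format is an assumption made for this sketch.
events = [
    ("u1", "exposure", 0.0), ("u2", "exposure", 0.0),
    ("u3", "exposure", 0.0), ("u4", "exposure", 0.0),
    ("u1", "purchase", 30.0), ("u1", "purchase", 10.0),
    ("u2", "purchase", 20.0),
]

def unique_users(events, name):
    """Set of users who triggered the given event at least once."""
    return {user for user, event, _ in events if event == name}

def event_count(events, name):
    """Number of times the given event was triggered."""
    return sum(1 for _, event, _ in events if event == name)

def total_value(events, name):
    """Sum of the values attached to the given event."""
    return sum(value for _, event, value in events if event == name)

exposed = unique_users(events, "exposure")    # 4 users
buyers = unique_users(events, "purchase")     # 2 users
revenue = total_value(events, "purchase")     # 60.0

# Each metric is just a different choice of numerator and denominator.
conversion_rate = len(buyers) / len(exposed)     # unique buyers / unique exposed users
aov = revenue / event_count(events, "purchase")  # revenue / number of orders
arpu = revenue / len(exposed)                    # revenue / all exposed users
arppu = revenue / len(buyers)                    # revenue / paying users only
```

With this sample data the four metrics come out to 0.5, 20.0, 15.0, and 30.0, which shows how the same events produce very different numbers depending on which denominator you pick.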

Firebase doesn’t allow you to segment data results for each metric.

Data segmentation lets you organize data into different groups, segments, and properties. This adds value to your data: because not all users behave the same way, you can extract key insights for each user group and determine whether a metric shows significant results only for specific groups. Data segmentation is not possible in Firebase.

With Hackle, segmentation analysis is possible. You can split your results by platform (iOS, Android, Web, etc.), browser, app version, or property information managed internally by the customer (e.g. membership status, first purchase status, gender, age group, etc.).

(Source: Hackle dashboard)
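As a rough illustration of why segmented results matter, the sketch below splits conversion rates by platform and variant. The rows are made-up sample data and the function name is hypothetical.

```python
from collections import defaultdict

# Made-up sample rows: (user_id, platform, variant, converted).
rows = [
    ("u1", "iOS", "A", True),      ("u2", "iOS", "A", False),
    ("u3", "iOS", "B", True),      ("u4", "iOS", "B", True),
    ("u5", "Android", "A", True),  ("u6", "Android", "A", True),
    ("u7", "Android", "B", False), ("u8", "Android", "B", True),
]

def conversion_by(rows, key):
    """Conversion rate per group, where key(row) names the group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [conversions, users]
    for row in rows:
        cell = totals[key(row)]
        cell[0] += int(row[3])
        cell[1] += 1
    return {group: conv / users for group, (conv, users) in totals.items()}

overall = conversion_by(rows, key=lambda r: r[2])             # by variant only
segmented = conversion_by(rows, key=lambda r: (r[1], r[2]))   # by platform and variant
```

In this sample, variants A and B both convert at 0.75 overall, yet B wins on iOS (1.0 vs 0.5) and loses on Android (0.5 vs 1.0). An unsegmented dashboard would hide that difference.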

Firebase does not offer advanced user targeting features.

Hackle provides a far more advanced and highly customizable targeting feature than that provided by Firebase. This means that with Hackle, you can expose the A/B test to a very niche pool of users by allowing multiple conditions to be set during targeting.

(Source: Hackle dashboard)
(Source: Firebase dashboard)
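Conceptually, multi-condition targeting means a user enters the experiment only when every condition holds. The following is a minimal sketch of that idea; the condition format, operator names, and user properties are all assumptions for illustration and do not reflect either product's actual rule syntax.

```python
# Hypothetical operators for a targeting rule.
OPERATORS = {
    "eq": lambda actual, expected: actual == expected,
    "in": lambda actual, expected: actual in expected,
    "gte": lambda actual, expected: actual is not None and actual >= expected,
}

def is_targeted(user, conditions):
    """True only if the user satisfies every (property, operator, value) condition."""
    return all(
        OPERATORS[op](user.get(prop), expected)
        for prop, op, expected in conditions
    )

# A niche audience: iOS users with a premium membership on app version 3.2.0 or later.
conditions = [
    ("platform", "eq", "iOS"),
    ("membership", "in", {"gold", "platinum"}),
    ("app_version", "gte", (3, 2, 0)),
]

user_a = {"platform": "iOS", "membership": "gold", "app_version": (3, 4, 1)}
user_b = {"platform": "iOS", "membership": "basic", "app_version": (3, 4, 1)}

targeted_a = is_targeted(user_a, conditions)  # all three conditions hold
targeted_b = is_targeted(user_b, conditions)  # fails the membership condition
```

The more conditions a tool lets you combine, the narrower the pool of users you can expose a test to.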

Firebase takes time to load and update changes to an A/B test setting.

The SDK’s role is to periodically receive the changes made on the A/B Test settings from the dashboard and reflect the changes to your code.

Compared to the Firebase SDK, the Hackle SDK has a significantly shorter cycle for picking up configuration changes, so changes to your A/B test take effect in near real-time.
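The effect of the update cycle can be sketched with a simple polling cache. The class below is a hypothetical illustration of how a periodic fetch behaves, with the clock passed in explicitly so the example is deterministic; it is not the implementation of either SDK, and the 60-second interval is an arbitrary number.

```python
class PollingConfig:
    """Caches experiment settings and refetches them once the interval has elapsed."""

    def __init__(self, fetch, interval_seconds):
        self._fetch = fetch                # callable returning the latest settings
        self._interval = interval_seconds
        self._last_fetch = None
        self._config = {}

    def get(self, now):
        """Return cached settings, refreshing them if they are older than the interval."""
        if self._last_fetch is None or now - self._last_fetch >= self._interval:
            self._config = self._fetch()
            self._last_fetch = now
        return self._config

# Stand-in for the dashboard: traffic share of variant B, in percent.
dashboard = {"variant_b_traffic": 50}
client = PollingConfig(fetch=lambda: dict(dashboard), interval_seconds=60)

at_start = client.get(now=0)["variant_b_traffic"]      # first call always fetches
dashboard["variant_b_traffic"] = 0                     # variant B is turned off
before_poll = client.get(now=30)["variant_b_traffic"]  # still the cached value
after_poll = client.get(now=60)["variant_b_traffic"]   # interval elapsed, refetched
```

Until the next poll, the client keeps serving the old setting, so the polling interval is the upper bound on how long a stopped or changed test can remain live on a device.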


Firebase does not update A/B test results to the dashboard in real-time.

Time is a very important concept in A/B testing.

Firebase provides experiment results based on data collected by Google Analytics. This means the numbers shown on the dashboard come from the previous day, and in some cases the data is almost 24 hours old. That delay matters because team members often need to judge and respond to the current situation when making decisions.

Hackle updates experiment results at least once an hour, so the dashboard reflects the current state and responses of users when you make a final decision.

Firebase mainly focuses on client-side SDK.

It is much more efficient to use a server SDK when A/B testing logic that runs on the server, such as search logic or recommended store listing logic, because the code work only needs to be done on the server.

On the other hand, if you use a client SDK to A/B test server-based logic, additional code work must be done on both the server and the clients (Android, iOS). This requires a significant amount of development resources and time, because backward compatibility and the deployment timing of the app must be considered, and the test code must be removed from the app after the A/B test has ended.
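A server-side test can be as small as one branch in the request handler. The sketch below assumes a simple hash-based assignment; the function names and the experiment key are hypothetical, and a real server SDK would handle assignment and event tracking for you.

```python
import hashlib

def assign_variant(experiment_key, user_id, variants=("A", "B")):
    """Deterministically map a user to a variant by hashing the experiment key and user id."""
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % len(variants)
    return variants[bucket]

def search(query, user_id):
    """Server-side handler: the only place the experiment branch lives."""
    variant = assign_variant("search-ranking", user_id)
    if variant == "B":
        return f"new-ranking results for {query!r}"
    return f"current-ranking results for {query!r}"

result = search("coffee", user_id="u1")
```

Because the branch lives only on the server, ending the test means deleting these few lines and redeploying the server, with no app release and no old client versions to keep compatible.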

Firebase samples data for A/B tests with users over a specific limit.

Firebase A/B Testing samples data, which in some cases may leave too little data for the results to reach statistical significance.

On the other hand, Hackle A/B testing provides results based on full data without sampling. This means more accurate metric calculations and more reliable data.

Hackle’s user data pool VS Firebase’s sampled user data pool
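The cost of sampling can be estimated with the standard error of a conversion rate, sqrt(p(1 - p) / n). The numbers below are purely hypothetical: the article does not state Firebase's sampling rate, so the 10% sample is an assumption chosen only to show the effect.

```python
import math

def standard_error(rate, users):
    """Standard error of a conversion rate estimated from `users` observations."""
    return math.sqrt(rate * (1 - rate) / users)

# Hypothetical experiment: 10% conversion rate, 100,000 users in one variant.
full = standard_error(0.10, 100_000)

# Assumed 10% sample of the same users (not a documented Firebase figure).
sampled = standard_error(0.10, 10_000)

# How much wider the uncertainty becomes when only the sample is analyzed.
widening = sampled / full
```

Under these assumptions the uncertainty is about 3.16 times wider (the square root of 10), so an effect that is detectable on the full data may not reach significance on the sample.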

Firebase may be prone to errors when calculating data for metrics related to conversion rates.

To calculate a conversion rate (numerator / denominator, e.g. button click rate or purchase conversion rate), the denominator and numerator must be measured from the same point.

However, in Firebase A/B Testing, the time when a user is distributed to a specific test group (A or B) of the A/B test (Denominator 1) and the time when the user is actually exposed to the screen where the A/B test is running (Denominator 2) may not match. In this case, the denominator and numerator are measured at different points, which may contaminate the results.

In a Hackle A/B test, the point at which a user is distributed to a specific test group (A or B) of the A/B test (Denominator 1) and the point at which the user is exposed to the screen where the A/B test is running (Denominator 2) always match. Because this point also coincides with the point from which the numerator events are measured, the denominator and numerator stay consistent and this kind of contamination is avoided.
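A small worked example shows how the choice of denominator changes a conversion rate. All counts below are invented for illustration and do not come from either product.

```python
# Invented counts per test group.
assigned = {"A": 1000, "B": 1000}  # Denominator 1: users distributed to a group
exposed = {"A": 400, "B": 400}     # Denominator 2: users who actually saw the tested screen
clicks = {"A": 40, "B": 60}        # Numerator: clicks, only possible after exposure

def rate(numerator, denominator):
    """Per-group conversion rate for the given numerator and denominator counts."""
    return {group: numerator[group] / denominator[group] for group in numerator}

by_assignment = rate(clicks, assigned)  # diluted by users who never saw the screen
by_exposure = rate(clicks, exposed)     # numerator and denominator share the same starting point
```

Measured against assignment, the click rates are 4% and 6%; measured against exposure, they are 10% and 15%. The 600 users per group who never reached the screen only add noise to the first calculation, and if that unexposed share differs between groups, the comparison itself becomes biased.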

Firebase has a limit to the number of metrics that can be set per A/B test.

Firebase is limited to a total of 6 metrics (1 primary metric and 5 additional metrics) that can be tracked and reported in a single A/B test.

With Hackle’s A/B testing, there is no limit to the number of metrics created and hence, various metrics can be monitored simultaneously.

Firebase does not support simultaneous A/B testing on both the production and development environments.

(Source: Hackle dashboard)

Because Firebase does not make a clear distinction between the production and development environments, developers can easily confuse the two. A clear distinction between environments is especially important when implementing A/B tests. Developers usually like to try out an A/B test in the development environment for QA purposes. If the distinction between the two environments is unclear, however, the A/B test can mistakenly be released to real users in the production environment.

Hackle provides both a production environment and a development environment for each and every A/B test, eliminating the possibility of such mistakes.

Check out Hackle at www.hackle.io in order to start creating your own customizable A/B tests and metrics that really matter.
