How to apply DORA metrics for mobile development

Everything You Always Wanted to Know About DORA in mobile* (*But Were Afraid to Ask)

rolgalan
The Glovo Tech Blog
Nov 17, 2022


Main Control Room at ESA’s Space Operations Centre (image: ESA/J.Mai, CC BY-SA 3.0 IGO)

Intro

A few years ago the DORA (DevOps Research and Assessment) team, now part of Google, presented four key metrics to measure the performance of a software engineering team. Last year they introduced a fifth one (reliability).

This caused great enthusiasm in the industry, and multiple companies started to adopt them. However, these metrics are heavily oriented toward a backend environment, and due to the nature of mobile development (constrained by distribution through the Google and Apple stores) they might not look relevant or easy to apply.

Another major caveat with applying DORA metrics to mobile is that they are generally focused on production systems, and specifically on controlling damage to them and limiting change in them, which is particularly hard when you do not fully control the distribution process.

I strongly believe that DORA metrics are still valuable for mobile, but you need to adapt them first to be more meaningful in this environment by focusing on the parts you actually have control over. I’ve seen several people asking for more details about this in different places, so I decided to summarize in this post how to follow and adapt DORA metrics for mobile development.

Besides DORA, CircleCI promotes another four key DevOps metrics to watch closely. They might look similar, but DORA focuses more on production, whereas the CircleCI metrics are more centered on the actual CI/CD process. The two frameworks share similar concepts, and we will reference both in this article.

The metrics

Deployment Frequency. When you do weekly production releases (or however long your cycle is, hopefully not longer than two weeks 🙃), there is little point in measuring this at all, as there is actually not much to improve or change.

But you still want to measure the time your teams take to produce new mobile features. Even if not every PR merged into your development branch is released to production, it’s still important to keep watching how many are merged each cycle. Measuring the throughput of your develop branch is key to ensuring a healthy development environment. Throughput is, in fact, the name CircleCI gives to this same metric.

  • How can you track it? You can get this info from your CI/CD system. If you don’t have that, you can still get it easily from the Git tree, or even from the GitHub API along with some other valuable information (see the sketch after this list).
  • How do we track it in Glovo? We use Jenkins for CI. It sends metrics to our observability system (Datadog) for each finished job, so we can track it from there.
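\
For example, if your code lives on GitHub, here is a minimal sketch (not our actual setup) that counts the PRs merged into develop during a given week via the GitHub search API; the repo name, dates, and token are placeholders:

```kotlin
// Minimal sketch: count PRs merged into `develop` in a given window through
// the GitHub search API. Repo name, dates, and token are placeholders.
import java.net.URI
import java.net.URLEncoder
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

fun mergedPrCount(repo: String, since: String, until: String, token: String): Int {
    val query = URLEncoder.encode(
        "repo:$repo is:pr is:merged base:develop merged:$since..$until", "UTF-8"
    )
    val request = HttpRequest.newBuilder()
        .uri(URI.create("https://api.github.com/search/issues?q=$query&per_page=1"))
        .header("Authorization", "Bearer $token")
        .build()
    val body = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()
    // Crude extraction of "total_count" to avoid pulling in a JSON library.
    return Regex("\"total_count\"\\s*:\\s*(\\d+)").find(body)!!.groupValues[1].toInt()
}

fun main() {
    val token = System.getenv("GITHUB_TOKEN") ?: error("GITHUB_TOKEN not set")
    println(mergedPrCount("acme/mobile-app", "2022-11-07", "2022-11-13", token))
}
```

Run weekly, this gives you the throughput trend directly, without any extra CI tooling.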

Mean Time To Recovery (MTTR). This one is a bit more complex. If you deploy a hotfix on backend or web, the systems recover as soon as it is released. But how do you track this in mobile? You need to wait for users to update before the problem is actually mitigated, which could take months (unless you have some force-update mechanism and are willing to use it at the high price of annoying your customers). Maybe you count recovery from the moment the affected version drops below a certain adoption threshold? Even then, the MTTR could take weeks, and you can’t control it.

The good news is that you don’t always have to publish a hotfix to solve a mobile outage. Most probably you have some Feature Toggles or Kill Switches in place to immediately disable the buggy code. Or sometimes you might even resolve it with a new backend deployment. So you need to define what recovery means for you. In our case, we consider an incident resolved when a fix is available for everyone and no new users are experiencing a degraded experience.
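\
As an illustration of the kill-switch approach, here is a minimal sketch assuming Firebase Remote Config as the flag service (any remote feature-flag system works similarly); the flag name and the two UI entry points are hypothetical:

```kotlin
import com.google.firebase.ktx.Firebase
import com.google.firebase.remoteconfig.ktx.remoteConfig

fun renderCheckout() {
    val remoteConfig = Firebase.remoteConfig
    // Activate the most recently fetched values, so a remotely flipped switch
    // takes effect without waiting for a new store release.
    remoteConfig.fetchAndActivate()
    if (remoteConfig.getBoolean("checkout_v2_enabled")) {
        showNewCheckout()    // the code path guarded by the kill switch
    } else {
        showLegacyCheckout() // known-good fallback
    }
}

// Hypothetical UI entry points, only here to keep the sketch self-contained.
fun showNewCheckout() { /* ... */ }
fun showLegacyCheckout() { /* ... */ }
```

Flipping `checkout_v2_enabled` off remotely disables the buggy path for all users on their next config fetch, long before a hotfix could clear store review.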

  • How can you track it? If you have alerts for your crashes, you can measure your MTTR from there once the monitor has recovered. However, no monitoring system is perfect, so you might have many false positives that auto-resolve and distort your metric.
  • How do we track it in Glovo? We write Post Incident Reviews after each meaningful outage in our systems. We record the time it took to recover from each of these incidents and track it over time.
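\
If you keep the detection and recovery timestamps from those reviews in a structured form, computing the MTTR is straightforward. A minimal sketch with hypothetical incident records:

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical record distilled from a Post Incident Review.
data class Incident(val detectedAt: Instant, val recoveredAt: Instant)

fun meanTimeToRecovery(incidents: List<Incident>): Duration =
    incidents
        .map { Duration.between(it.detectedAt, it.recoveredAt) }
        .fold(Duration.ZERO, Duration::plus)
        .dividedBy(incidents.size.toLong())

fun main() {
    val incidents = listOf(
        Incident(Instant.parse("2022-11-01T10:00:00Z"), Instant.parse("2022-11-01T14:30:00Z")),
        Incident(Instant.parse("2022-11-08T09:15:00Z"), Instant.parse("2022-11-09T09:15:00Z")),
    )
    println("MTTR: ${meanTimeToRecovery(incidents).toHours()} hours")
}
```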

Change Failure Rate. This could be as simple as checking how many hotfixes you publish. However, in mobile there are plenty of minor defects in each release that don’t justify a hotfix but that you still want to fix before the next release. In my opinion, you should measure these as part of your CFR too, computing the ratio of defects to commits in each release to get something similar. But to get that, you first need the team to clearly mark the commits that fix something.

  • How can you track it? Following the idea of marking commits, you can adopt Conventional Commits or any similar strategy to identify the purpose of each commit merged into your branches (see the sketch after this list). Alternatively, you can keep this information in your ticketing system.
  • How do we track it in Glovo? We are not really measuring this company-wide. Some teams do it by themselves in different ways.
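\
To illustrate the commit-marking approach: assuming the team follows Conventional Commits, a small script can compute a defects-to-commits ratio for a release by counting fix-typed commits between two release tags (the tag names are placeholders):

```kotlin
import java.io.BufferedReader
import java.io.InputStreamReader

// Subjects of all commits in a Git revision range, e.g. "v5.22.0..v5.23.0".
fun commitSubjects(range: String): List<String> {
    val process = ProcessBuilder("git", "log", "--pretty=%s", range).start()
    return BufferedReader(InputStreamReader(process.inputStream)).readLines()
}

fun main() {
    val subjects = commitSubjects("v5.22.0..v5.23.0")
    // Matches both "fix: ..." and "fix(scope): ..." subjects.
    val fixes = subjects.count { Regex("^fix(\\(.+\\))?:").containsMatchIn(it) }
    println("Defects/commits for this release: $fixes/${subjects.size}")
}
```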

Lead Time for Changes. This is the most unrealistic one in mobile, as you depend on the store review, which is not under your control. But if you want a better understanding of your system from a DevOps perspective, you can track how long your pipeline takes to execute the build that publishes to the store. When you release every one or two weeks this isn’t so important, but when someone is trying to publish a hotfix it is critical. Doing this is equivalent to the Duration metric proposed by CircleCI.

Depending on how your release process looks, it might make sense to measure delays in your release windows as part of this. In our case, we release a beta app internally to all Glovo employees five days before the actual release to production. When that moment arrives, we might have identified relevant issues in the beta worth resolving, which delays the release. Even though we are not systematically tracking this, it might be worth keeping an eye on in the long term to ensure a healthy process.

  • How can you track it? You should have some observability from your CI/CD tooling.
  • How do we track it in Glovo? Same as for the first metric, we push build time metrics to Datadog for each job, so we check this duration from there (see the sketch below the graphs).
Some of the metrics we track internally in the mobile repos at Glovo: build throughput week over week, develop build failure rate, duration of successful release jobs by stage (time to build and prerelease in Google Play, time to promote that build to production, and the rollout increases performed automatically by the CI/CD), and average pull request duration week over week. The last two graphs show a noticeable decrease in build time over the recent months.
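\
If your CI tool doesn’t ship job durations out of the box, reporting them to the local DogStatsD agent is a small UDP write. A minimal sketch, with a hypothetical metric name and tag and assuming an agent on the default port 8125 (not an exact copy of our Jenkins setup):

```kotlin
import java.net.DatagramPacket
import java.net.DatagramSocket
import java.net.InetAddress

// Send one timing metric to a local DogStatsD agent.
// Wire format: <metric>:<value>|ms|#<tag1>,<tag2>
fun reportJobDuration(metric: String, millis: Long, tags: String) {
    val payload = "$metric:$millis|ms|#$tags".toByteArray()
    DatagramSocket().use { socket ->
        socket.send(DatagramPacket(payload, payload.size, InetAddress.getByName("localhost"), 8125))
    }
}

fun main() {
    val start = System.currentTimeMillis()
    // ... run the build or upload step here ...
    reportJobDuration("mobile.release.duration", System.currentTimeMillis() - start, "stage:upload_to_play")
}
```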

DORA vs CircleCI metrics

On our journey to adapt DORA metrics to mobile, we have seen how some of these metrics map precisely to the CircleCI DevOps metrics, while others differ slightly because DORA focuses more on production and the CircleCI metrics are more about the actual CI/CD process.

Both frameworks talk about tracking MTTR. However, DORA measures production incidents, while CircleCI records incidents in the actual pipeline (such as test failures or code conflicts due to outdated branches being merged).

In the same way, DORA’s Change Failure Rate focuses on failed releases to production, whereas CircleCI tracks the success or failure of the actual build, regardless of what it builds.

The same applies to DORA’s Lead Time for Changes, which we adapted above by reusing the concept behind CircleCI’s Duration metric: on its own, it is just about releasing to production. While it is important to measure production releases, it’s also key to keep tracking the duration of the other relevant builds (like the PR checks that might block your team from merging new features, and that will impact the throughput of your engineers).

Conclusion

I hope this article added some clarity about how to apply DORA and other DevOps metrics to mobile environments. Considering the constraints of mobile development (mainly the lack of full control over distribution, since it is managed by the stores), these are, in our opinion, the most suitable ways to keep track of these metrics. Regardless of whether you are doing native Android/iOS development or any hybrid mobile approach, this should apply to you, as all mobile development shares the same kind of challenges that set it apart from backend or web development.

Remember also that if you are tracking the productivity and performance of your teams, these are not the only metrics to evaluate. There are many other options. In particular, you should look into the SPACE framework for a more comprehensive set of metrics to analyze productivity from a multidimensional point of view, considering not only technical metrics but also developer satisfaction, collaboration and well-being, which are as important for productivity as the more software-focused metrics.

Last, but not least: we are here for business. We are not paid for doing more releases, but for doing releases that have a positive impact on our customers and on the business KPIs. So tracking DORA or other DevOps/productivity metrics is important and a good indicator of the health of your internal systems, but it does not tell you the whole story about how well you are working. Never forget to keep a close eye on those business metrics too.
