DORA metrics

Dharin Parekh
Published in AnalyticsVerse
Feb 12, 2022

We have always believed in measuring things to improve engineering efficiency. DORA’s research program represents seven years of research and data from over 32,000 professionals worldwide, and it has identified the capabilities that drive high performance in technology delivery and, ultimately, organizational outcomes.

The four key metrics give an idea of how quickly your organization delivers software and how stable your software systems are. These four metrics have been found to be linked directly to business outcomes, and they allow teams to benchmark themselves against the industry and identify as low, medium, high, or elite performers. Elite performers have consistently been found to outperform others on all four metrics.

What are these four metrics?

The DORA framework essentially looks at four metrics. Deployment Frequency and Lead Time for Changes are used to measure DevOps speed, while Change Failure Rate and Mean Time to Recover are used to measure stability.

While achieving speed and stability may sound contradictory (achieving speed may seem to mean compromising on stability), DORA’s research shows that it is possible to optimize for stability without sacrificing speed.

The Four DORA Metrics

Deployment Frequency

Deployment Frequency represents “How often an organization successfully releases to production”.

Improving your CI/CD capabilities helps improve this metric; aiming for continuous deployment in itself implies a high deployment frequency. Being able to deploy frequently allows you to deliver value to your users continuously, which also leads to faster feedback cycles.
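
As a rough illustration, here is a minimal sketch of how deployment frequency could be computed from a list of successful production deployment timestamps (the timestamps, the seven-day window, and the data source are made up for the example; in practice the data would come from your CI/CD tooling):

```python
from datetime import datetime

# Hypothetical data: timestamps of successful production deployments over
# the last seven days, e.g. exported from your CI/CD tool.
deployments = [
    datetime(2022, 2, 1, 10, 30),
    datetime(2022, 2, 1, 16, 5),
    datetime(2022, 2, 3, 9, 45),
    datetime(2022, 2, 7, 14, 20),
]

# Deployment frequency: successful production deploys per day over the window.
window_days = 7
deploys_per_day = len(deployments) / window_days
print(f"Deployment frequency: {deploys_per_day:.2f} deploys/day")
```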

Elite teams have been reported to deploy multiple times a day. By comparison, low performers have been reported to deploy between once per month and once every six months. Companies such as Amazon, Google, and Netflix deploy thousands of times per day (aggregated over the hundreds of services that comprise their production environments).

Lead Time for Changes

Lead Time for Changes represents “The amount of time it takes for any change to get into production”.

This can essentially be measured as the time from code committed to code successfully deployed in production. This number can be inflated by factors such as separate test teams, staged testing environments, complicated review procedures, etc.
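
As a minimal sketch (with made-up commit and deploy timestamps), lead time for changes could be computed as follows; reporting the median of the individual lead times is a common choice:

```python
from datetime import datetime
from statistics import median

# Hypothetical data: (commit time, production deploy time) for each change,
# which in practice would come from your VCS and CI/CD tooling.
changes = [
    (datetime(2022, 2, 1, 9, 0),  datetime(2022, 2, 1, 17, 30)),  # deployed same day
    (datetime(2022, 2, 2, 11, 0), datetime(2022, 2, 4, 10, 0)),   # roughly two days
    (datetime(2022, 2, 3, 15, 0), datetime(2022, 2, 8, 12, 0)),   # roughly five days
]

# Lead time per change = time deployed minus time committed.
lead_times = [deployed - committed for committed, deployed in changes]
print(f"Median lead time for changes: {median(lead_times)}")
```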

Elite teams have reported change lead times of less than one day, while low performers reported lead times between one month and six months.

Change Failure Rate

Change Failure Rate represents “The percentage of deployments causing a failure in production”.

This can essentially be measured as the number of deployments causing failures divided by the total number of deployments, expressed as a percentage. This metric is indicative of the stability of your deployments. A high deployment frequency coupled with a high failure rate is not a desirable outcome, as it indicates optimizing for speed at the cost of stability. Another factor that can be taken into consideration alongside CFR is the availability of the service; low availability is also not a desirable outcome.
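
For illustration, here is a minimal sketch of that calculation (the deployment records and the caused_failure flag are made up; in practice the flag would be derived from incidents, rollbacks, or hotfixes linked to each deploy):

```python
# Hypothetical data: recent production deployments, each flagged with whether
# it caused a failure (incident, rollback, hotfix) in production.
deployments = [
    {"id": 101, "caused_failure": False},
    {"id": 102, "caused_failure": True},
    {"id": 103, "caused_failure": False},
    {"id": 104, "caused_failure": False},
]

# Change failure rate = failed deployments / total deployments, as a percentage.
failed = sum(1 for d in deployments if d["caused_failure"])
change_failure_rate = 100 * failed / len(deployments)
print(f"Change failure rate: {change_failure_rate:.0f}%")  # 25% for this sample
```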

Elite performers reported a change failure rate between zero and 15%, while low performers reported change failure rates of 46% to 60%.

Mean Time to Recover

Mean Time To Recover represents “How long it takes for an organization to recover from a failure in production”.

This can essentially be measured as the time from when a failure occurs in production to when it gets fixed. This metric is indicative of how quickly you can recover from failures or incidents. A lower MTTR can be achieved by making smaller deployments and by having good monitoring tools that quickly identify what caused the failure.
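
A minimal sketch of that calculation, using made-up incident timestamps (in practice these would come from your incident-management or monitoring tooling):

```python
from datetime import datetime, timedelta

# Hypothetical data: (failure detected, service restored) for each production incident.
incidents = [
    (datetime(2022, 2, 1, 14, 0), datetime(2022, 2, 1, 14, 40)),  # 40 minutes
    (datetime(2022, 2, 5, 9, 15), datetime(2022, 2, 5, 11, 0)),   # 1 hour 45 minutes
]

# MTTR = average time from failure to recovery across incidents.
recovery_times = [restored - failed for failed, restored in incidents]
mttr = sum(recovery_times, timedelta(0)) / len(incidents)
print(f"Mean time to recover: {mttr}")
```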

Elite performers have reported a mean time to recover of less than one hour, while low performers have reported MTTRs between one week and one month.

Why and How to use DORA metrics?

These metrics are a great way to understand where your organization currently stands and how it compares to others in the industry. Because they are proven and backed by research, it is much easier to benchmark your organization against other organizations using them.

The research bucketed performance on these four metrics as follows (elite and low benchmarks as reported, with high and medium performers falling in between):

  1. Deployment Frequency: elite teams deploy multiple times a day; low performers deploy between once per month and once every six months
  2. Lead Time for Changes: elite teams take less than one day; low performers take between one month and six months
  3. Change Failure Rate: elite teams report between 0 and 15%; low performers report 46% to 60%
  4. Mean Time to Recover: elite teams recover in less than one hour; low performers take between one week and one month

Becoming an elite performer has a positive effect on organizational performance (this includes profitability, productivity, and customer satisfaction); for context, see the State of DevOps Report 2019’s findings about elite performers.

Now, if you are convinced about tracking these metrics, the next step is to figure out how to track them. You could gather this data manually every week or month and observe the trends, but this process is time-consuming and takes a lot of effort. Alternatively, you could use an engineering intelligence platform like AnalyticsVerse that automates the whole process and tracks these metrics continuously. AnalyticsVerse also helps you dive deep into these metrics and improve them.

Now that you have the data for these metrics, the next step is identifying how to improve on them. A number of factors could contribute to undesirable values for these metrics. Some of them could be:

  1. A complicated and time-consuming change approval process could lead to a lower deployment frequency
  2. Insufficient testing capabilities could lead to a higher failure rate
  3. Poor monitoring capabilities could lead to longer recovery times from failures.

AnalyticsVerse goes a step further in these cases by identifying bottlenecks and gathering insights across a number of areas, which makes your team self-sufficient to continuously improve and move towards becoming elite performers.

Learn more about this here
