The Four Key Metrics from Accelerate

Daniel Yokoyama
Mar 28, 2023

For Brazilian readers and anyone else who prefers Portuguese, there is a translated version of this article here.

Last time we talked, I was trying to explain how we can use a DevOps culture to nurture high-performance teams. I spent a while talking about “The Three Ways” from Gene Kim’s The Phoenix Project, and I mentioned Accelerate, another book by Gene Kim (along with Jez Humble and Dr. Nicole Forsgren), as an idea for a more “tactical” approach to stimulating a DevOps culture within the organization. Today I want to delve deeper into this approach by exploring Accelerate a little further and seeing how it can establish a general guideline to assess how mature teams are regarding DevOps and how they can improve.

In the world of software development and operations, high performance is key. But how do you measure performance, and what metrics are the most important?

Measuring DevOps Maturity

Let’s just make things clear: I’m not in favor of adopting any DevOps maturity model to measure DevOps within an organization. Accelerate clarifies that we should focus on capabilities instead of maturity, and I’ll stick to the book on this matter. The whole first chapter describes the authors’ effort to come up with measures and metrics that could indicate high-performing teams, and how reliable those measures are for understanding how an organization grades against them.

Focusing on capabilities means looking at a team’s ability to perform certain technical practices that are associated with high performance. These technical practices include things like version control, automated testing, continuous integration, and deployment automation, among others. Measuring capabilities involves assessing a team’s level of proficiency in each of these practices.

According to Accelerate, there are four key metrics that high-performing teams focus on:

  • Lead Time
  • Deployment Frequency
  • Mean Time to Restore (MTTR)
  • Change Fail Percentage

These metrics provide a comprehensive view of software delivery and operational performance and have been shown to be strong predictors of organizational performance.

According to Kim, “The most successful organizations are those that are able to achieve high levels of both throughput and stability.” Throughput refers to the ability to deliver new features and updates quickly, while stability refers to the ability to do so reliably and with high levels of quality.

Let’s take a closer look at each of the four key metrics:

Lead Time

“Elite performers have lead times of less than one day, while low performers have lead times of more than one month.” — Accelerate

Lead time for changes measures the time it takes for a code change to be implemented and deployed to production. High-performing teams have significantly shorter lead times than low-performing teams, allowing them to deliver value to customers more quickly and respond to changing market conditions faster.

To achieve short lead times, high-performing teams use practices such as continuous integration and delivery, automated testing, and small batch sizes. By automating repetitive tasks and breaking work into smaller pieces, teams can reduce the time it takes to get code changes into production.
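
To make the definition more concrete, here is a minimal sketch of how lead time could be computed from change records. The records, timestamps, and field names are hypothetical examples, not something prescribed by the book:

```python
from datetime import datetime
from statistics import median

# Hypothetical change records: when the code was committed and when it reached production.
changes = [
    {"committed": datetime(2023, 3, 1, 9, 0),  "deployed": datetime(2023, 3, 1, 15, 30)},
    {"committed": datetime(2023, 3, 2, 10, 0), "deployed": datetime(2023, 3, 3, 11, 0)},
    {"committed": datetime(2023, 3, 6, 14, 0), "deployed": datetime(2023, 3, 6, 18, 45)},
]

# Lead time for each change, in hours, from commit to production deployment.
lead_times_h = [(c["deployed"] - c["committed"]).total_seconds() / 3600 for c in changes]

print(f"Median lead time: {median(lead_times_h):.1f} hours")
```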

Deployment Frequency

“Elite performers deploy code up to multiple times per day, while low performers deploy code once per month or less.” — Accelerate

Deployment frequency measures how often changes are deployed to production. High-performing teams deploy changes far more frequently than low-performing teams, enabling them to respond to customer needs more quickly and iterate on their product more rapidly.

However, it’s important to note that deployment frequency should not be pursued at the expense of stability and reliability. High-performing teams are able to achieve both frequent deployments and high levels of stability through the use of practices such as automated testing, continuous integration, and continuous delivery.
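
Purely as an illustration (the deployment dates below are made up), deployment frequency can be derived by counting production deployments over a fixed window:

```python
from datetime import date

# Hypothetical production deployment dates collected over a four-week window.
deployments = [
    date(2023, 3, 1), date(2023, 3, 1), date(2023, 3, 3),
    date(2023, 3, 8), date(2023, 3, 13), date(2023, 3, 15),
    date(2023, 3, 21), date(2023, 3, 27),
]

window_days = 28
per_week = len(deployments) / (window_days / 7)
print(f"Average deployment frequency: {per_week:.1f} deployments per week")
```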

Mean Time to Restore (MTTR)

“Elite performers have times to restore service of less than one hour, while low performers have times to restore service of more than one day.” — Accelerate

Mean Time to Restore service measures the time it takes to restore service after a production incident or outage. High-performing teams are able to quickly diagnose and fix issues in production, minimizing the impact on customers and reducing downtime.

To achieve short times to restore service, high-performing teams use practices such as monitoring, alerting, and incident response processes. By proactively monitoring their systems, teams can quickly identify issues when they occur and respond to them before they impact customers. When an incident does occur, teams with strong incident response processes can quickly mobilize and work together to diagnose and fix the issue.
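
A minimal sketch of the calculation itself, assuming hypothetical incident records with start and restore timestamps:

```python
from datetime import datetime

# Hypothetical incident records: when the incident began and when service was restored.
incidents = [
    {"started": datetime(2023, 3, 2, 14, 0),  "restored": datetime(2023, 3, 2, 14, 40)},
    {"started": datetime(2023, 3, 10, 3, 15), "restored": datetime(2023, 3, 10, 5, 0)},
    {"started": datetime(2023, 3, 20, 9, 30), "restored": datetime(2023, 3, 20, 9, 55)},
]

# Mean Time to Restore: average duration from incident start to restored service.
durations_min = [(i["restored"] - i["started"]).total_seconds() / 60 for i in incidents]
mttr_min = sum(durations_min) / len(durations_min)
print(f"MTTR: {mttr_min:.0f} minutes")
```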

Change Fail Percentage

“Elite performers have change failure rates of less than 15 percent, while low performers have change failure rates of more than 46 percent.” — Accelerate

Change Fail Percentage measures the percentage of changes that result in a failure or defect that must be addressed. High-performing teams have significantly lower change failure rates than low-performing teams, indicating that they are able to deliver changes to production with higher levels of quality.

To achieve low change failure rates, high-performing teams use practices such as automated testing, continuous integration, and continuous delivery. By catching issues earlier in the development process and ensuring that changes are thoroughly tested before being deployed, teams can reduce the likelihood of defects and failures in production.
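
The calculation is simple; here is a sketch against a hypothetical deployment log where each entry records whether the change had to be remediated:

```python
# Hypothetical deployment log: True means the change degraded service and needed
# remediation (a rollback, hotfix, patch, or incident).
changes_failed = [False, False, True, False, False, False, True, False, False, False]

change_fail_pct = 100 * sum(changes_failed) / len(changes_failed)
print(f"Change failure rate: {change_fail_pct:.0f}%")  # 20% for this made-up sample
```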

It’s also important to note that the change failure rate should not be viewed as a punitive metric. Instead, it should be used to identify areas for improvement and drive continuous improvement in the development process.

DORA’s Fifth Metric: The Operational Performance Metric

That’s right… there’s a fifth metric to go with the 4 Key Metrics, much like the Three Musketeers (who were actually four).

In addition to these four key metrics, DORA (DevOps Research and Assessment) has an additional metric for Operational Performance, based on Reliability. It is a measure of modern operational practices, indicating how well the services meet user expectations, such as availability, latency, performance, and scalability. According to the 2022 State of DevOps Report from DORA, teams with varying degrees of delivery performance see better outcomes (e.g.: less burnout) when they also prioritize operational performance.

Fitness functions can be useful for running performance and scalability diagnostics within the development workflow, while monitoring ensures that the production environment stays consistent with those diagnostics.
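
For illustration, here is a rough sketch of what such a fitness function might look like when run in a delivery pipeline. The endpoint, the 300 ms budget, and the measurement approach are all assumptions for the example, not recommendations from DORA or the book:

```python
import statistics
import time
import urllib.request

# Hypothetical budget: the p95 latency of the health endpoint must stay under 300 ms.
URL = "https://staging.example.com/health"   # placeholder endpoint
P95_BUDGET_MS = 300


def measure_latency_ms(url: str, samples: int = 20) -> list:
    """Issue a few requests and record the wall-clock latency of each, in milliseconds."""
    latencies = []
    for _ in range(samples):
        start = time.perf_counter()
        urllib.request.urlopen(url, timeout=5).read()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def latency_fitness_function() -> None:
    """Fail the pipeline when the measured p95 latency drifts past the agreed budget."""
    latencies = measure_latency_ms(URL)
    p95 = statistics.quantiles(latencies, n=20)[18]  # 95th-percentile cut point
    assert p95 <= P95_BUDGET_MS, f"p95 latency {p95:.0f} ms exceeds {P95_BUDGET_MS} ms budget"
```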

What Does It Mean to Be a High-Performing Team?

I’ve been talking a lot about using DevOps culture to create high-performing teams across an organization, and I have brought these metrics into the discussion. Obviously, measuring how often a team is able to deploy, or how long it takes for a new change to make its way into production, gives the team solid feedback about its own maturity. In Accelerate, the authors describe how they used cluster analysis to categorize team performance based on the data provided by survey respondents. They applied this technique over four years of research (the results are published by Puppet as The State of DevOps Report, from 2014 through 2017) and found that, every year, there were consistently similar categories of software delivery performance in the industry.

Here’s how the 2016 results, compared with the 2017 findings, illustrate how the four key metrics helped the cluster analysis categorize performance (source: Accelerate):

2016 cluster analysis from Accelerate
2017 cluster analysis from Accelerate

* Low performers were lower on average (at a statistically significant level) but had the same median as the medium performers.

Note how the thresholds that distinguish the performance categories for Change Failure Rate change from the 2016 data to the 2017 data: Medium Performers improved from 21–45% in 2016 to 0–15% in 2017, becoming comparable to High Performers, while the opposite happened to the Low Performers, who were at 16–30% in 2016 but were clustered at 31–45% in 2017.

I recommend reading the book for more information about how the survey was designed and how the authors used the data. It is an excellent source of information, and it goes on to explain the whole research process.

How Can Platform-Engineering Teams and SRE Help Improve These Metrics?

Now that we’re acquainted with the Four Key Metrics, let’s talk about how Platform-Engineering and SRE can contribute to teams improving their performance.

When it comes to stability, the “Operational Performance” metric, and every good practice mentioned in the DORA report that contributes to this measurement, it all boils down to what we call Reliability. It’s common sense that software fails once in a while: something will go wrong, some piece of software will misbehave, some arbitrary machine will malfunction, internet connectivity will be lost; any number of things may happen (and they will; in Murphy we should trust). Reliability measures how solid your monitoring systems are and how capable you are of acknowledging that something is wrong and acting fast, or even of preventing these scenarios by preparing automated tasks that anticipate such failures and let the whole system react to them on its own (almost as if it could self-restore). That’s the purpose of Site Reliability Engineering (SRE).

First and foremost, the organization must be able to determine what kind of operational performance is considered “good enough” to maintain, and what is “bad enough” to act upon. That has to be defined based on how the user experiences the product: what latency baseline impacts conversion; beyond which point availability is not worth improving, because the user probably won’t notice the difference; and how aware we are of the load capacity of the whole system, so that we can respond when that load is exceeded and do so in a way that lets the system handle the new load.

To do this, SRE helps the organization define the Service-Level Indicators (SLIs) that are going to be measured, in order to establish what counts as “good” performance and when we should treat it as “bad”. These thresholds are called Service-Level Objectives (SLOs), and they help teams understand when something is off. In a future article, I hope to talk more about SLIs and SLOs, as well as explore Error Budgets and explain Observability.
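
As a minimal sketch, assuming an availability SLI computed from request counts (the request numbers and the 99.9% objective are purely illustrative):

```python
# Hypothetical request counts for the measurement window (e.g. 30 days).
total_requests = 12_000_000
good_requests = 11_994_500  # requests served successfully and within the latency target

SLO = 0.999  # 99.9% availability objective agreed for this service

sli = good_requests / total_requests
error_budget = 1 - SLO                   # fraction of requests allowed to fail
budget_spent = (1 - sli) / error_budget  # how much of that allowance has been used

print(f"SLI: {sli:.4%} | SLO: {SLO:.1%} | error budget spent: {budget_spent:.0%}")
```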

Now, when it comes to Platform-Engineering, a lot can be achieved by software delivery teams when a set of tools is in place to make the whole SDLC smoother and more reliable. It ranges from the simplest things, like providing a version control system and build-automation tools that enable Continuous Integration and Continuous Delivery, to bringing tools into the workflow that improve security (shifting left on security) and automate the operation of the system and its parts (configuration management, rollout strategies, circuit-breaking, etc.), ideally abstracting all the underlying plumbing so that publishing services into production becomes easier over time.
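
To give one concrete example of the kind of plumbing a platform can abstract away, here is a toy sketch of the circuit-breaking idea mentioned above. In practice this usually comes from a service mesh or a battle-tested library rather than hand-rolled code like this:

```python
import time


class CircuitBreaker:
    """Toy circuit breaker: after too many consecutive failures, stop calling the
    downstream dependency for a cool-down period instead of letting failures pile up."""

    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                raise RuntimeError("circuit open: skipping call to failing dependency")
            self.opened_at = None  # cool-down elapsed, allow one try ("half-open")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure count
        return result
```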

There’s no doubt we’ll have a lot to talk about in the future to address how a Platform-Engineering team can contribute to creating High-Performance Software Delivery workflows.

However, Software Delivery Teams are still in charge of controlling the quality of their software by automating tests and running code-quality tools (code standards and metrics).

Conclusion

The four key metrics presented in the book Accelerate — Lead Time, Deployment Frequency, Mean Time to Restore (MTTR), and Change Failure Rate — are powerful tools for evaluating the performance of software delivery teams and identifying areas of improvement. These metrics provide an overview of software delivery and operational performance and have been shown to be strong predictors of organizational performance.

By focusing on team capabilities rather than their DevOps maturity, companies can gain valuable insights into their ability to deliver high-quality software quickly and safely. With the proper practices, such as version control, automated testing, continuous integration, and deployment automation, among others, teams can improve these four key metrics and achieve high performance.

In summary, adopting a DevOps culture is essential for companies seeking to achieve high performance in their software and operations projects. With the help of automation practices and appropriate tools, teams can significantly improve their ability to deliver software quickly and safely while maintaining high levels of quality and stability. An ongoing evaluation of the four key performance metrics can help ensure that teams are on the right track to achieve these goals.

P.S.: Special thanks to my friends: Daniel Pilon and Frederico Vitorino for help with proofreading. I love you guys!
