Performance metrics for high-functioning data teams (Miniseries Part 4)

The DORA metrics have been widely accepted as the gold standard for “what good looks like” in software engineering. They are not perfectly applicable to high-functioning data teams, largely because committing code is not the sole determinant of value for data teams. In this article, we discuss different types of data team and how the applicability of these metrics varies accordingly.

9 min read · Aug 23, 2023


On Orchestra

We believe data tooling should exist in the context of releasing quality data and running quality data operations. Framing the capabilities of a data tool in terms of how it helps release data more robustly is necessary to elevate data teams into the stratosphere of value currently occupied only by the greatest individual contributors in software teams. What we’re building will let data teams deploy data in an extremely robust way in a fraction of the time and at a fraction of the cost, leaving them time to focus on what really matters: creating business value.

Learn more about Orchestra here.

What are the DORA metrics?

The consultancy founded by the authors of Accelerate is called DevOps Research and Assessment, or DORA for short.

Having settled on 24 characteristics of effective software teams, the authors naturally needed a way to measure efficacy. Exhibiting a characteristic like “does continuously integrate code” is binary and has no intrinsic value. Shipping more features in a given week is not binary and does have intrinsic value, as it is a measure of the outcome variable, chosen here to be broadly “efficacy” or “efficiency”. These metrics are outlined in the table below:

Deployment frequency is a measure of how often code is released to end users, and is intrinsically linked to working in small batches and doing continuous integration and delivery. If CI/CD is in full swing and developers work in small batches, code should be released frequently and in small batches.

Lead time for changes is closely related to velocity. Lead time is the average time taken to complete a unit of work, whereas velocity is more widely known as the number of units of work a team can complete in a given time; the two are naturally inversely proportional. The report indicates that lead time for the best teams is between one day and one week, which is a very wide range. When we think of the units of work software teams are accustomed to (epics, features, and tasks / stories), a day feels like a lot for a task but not much for a story. We conjecture that what good looks like for this metric is highly variable and team-specific.

Time to restore service is a measure of recovery, and identifies holes in monitoring and observability of software systems as well as CI/CD (as typically recovering from a bug requires an additional push of code). This is an important metric for larger organisations, and we’ll see it is extremely important for Data Teams.

Change failure rate is a measure of quality. Generally speaking, developers should push code frequently, push lots of code, and push high-quality code. Change failure rate is notoriously difficult to measure, since tying incidents to work items is challenging to do in an automated fashion. However, the percentage of bug vs. non-bug work items can be used as a convenient proxy.
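As a rough illustration of the proxy described above, the sketch below computes the share of work items labelled as bugs. The function name `bug_share` and the label values are assumptions made for this example; a real ticketing system will have its own labels and export format.

```python
from collections import Counter

def bug_share(work_item_types):
    """Return the fraction of work items labelled 'bug' (0.0 for an empty list)."""
    if not work_item_types:
        return 0.0
    counts = Counter(label.lower() for label in work_item_types)
    return counts["bug"] / len(work_item_types)

# Hypothetical labels exported from a ticketing tool.
items = ["feature", "bug", "task", "feature", "bug", "story", "task", "feature"]
print(f"{bug_share(items):.0%}")  # 25%
```

This only approximates change failure rate: it says how much of the team's work is bug fixing, not which change caused which bug.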

Comparisons to Data

It would be tempting to assume the DORA metrics are sufficient for assessing the efficacy of a data team. After all, Data teams ship code, and that code can therefore be measured on the same scale.

There are a few issues with this approach:

  • Data teams generally only build software applications for the sake of moving and transforming data. We aren’t building software applications that are used by customers. This means that the different metrics have different definitions in a data context. For example, “recovering from an incident” has a completely different meaning in a Data context.
  • Shipping code has no intrinsic value for Data teams, as data teams ship data as well as code. Deployment Frequency and Lead time definitions would also need to change.
  • A degraded service in data parlance would include broken parts of a data pipeline, but also poor-quality data or broken end-user tools such as dashboards or CRM tools.


  • Data teams are generally focussed on delivering business value. This means it would be inappropriate to leave cost/benefit out of a performance assessment entirely.
  • Data quality is a prerequisite for ensuring that the product data teams deliver is usable and does not degrade. Intuitively, data quality would be a good measure here.
  • The processes by which data is released into production are generally much more complicated than with software; data needs to be moved, transformed, tested, and so on. This can happen in a multitude of ways. Intuitively, pipeline completion rates feel like a helpful, more granular measure than deployment frequency alone.

Understanding different Data Team profiles

Before proposing metrics that are important for understanding the efficacy of a Data Team, it is worth pointing out that Data Teams serve different functions in different organisations. One of the most profound things in Accelerate was how some things were relevant to software teams irrespective of the organisation they were part of. Continuous Integration is a good thing whether you work at a start-up of 1 or at a Big Tech company.

In Data, there are some things that stand out here too. For example, having a culture where the importance of data is understood, and data producers are concerned with raw data quality is undoubtedly important at organisations of any size. However, there are also grey areas.

For example, it’s tempting to think that having separate environments for staging and production is desirable for data teams of all sizes. However, for a start-up, creating a complete copy of all data may simply be too expensive to justify.

An important consideration is whether cost and business value should be a factor in assessing the success of a Data Team. Data is fundamentally tied to business value: there is typically no intrinsic value in having a Data Team, whereas in many organisations software teams are indispensable since, without them, the business has no Product. We therefore believe business value should be considered when assessing the efficacy of a Data Team. It is also intuitive to say that a Data Team operating on a $200/month budget that can do the same things as one operating on a $200,000/month budget is superior, whereas we might not want to draw that conclusion for software engineers.

With this in mind, we can say data teams operate on a maturity spectrum that is typically associated with organisation size but not always. For example, some data-heavy businesses may accelerate the data team operation even if the company is small and young. Bricks and Mortar businesses are highly unlikely to do this, and are only likely to require a Data Team once a certain scale is reached. This can be seen in the diagrams below:

Mature organisations who can make use of Data

In this diagram, we see that the Business Value accrued is maximised w.r.t cost by having a Data Team proportional to D*. This would be typical of a large business where there is a sufficient amount of data that drawing insights from it results in tangible improvements in operational efficiency.

Organisations with a medium-level of maturity

In this example, we see that in a medium-sized business the benefit is maximised w.r.t cost when the data team consists of a single person. There is actually a negative net benefit beyond a certain point. This is typical of late-stage start-ups that are starting to amass data and have a sufficiently mature infrastructure to make use of analytical insights to improve operational efficiency.

Immature organisations / non-data-driven orgs

Many businesses find it optimal to not have a data team at all. These would be start-ups, or small “Bricks and Mortar” businesses who have greater operational gains in other areas such as marketing or distribution to realise before moving onto data and analytics.

In software, it’s taken for granted that software engineers prioritise the highest-impact features and minimise the cost of tooling. In Data, understanding what the highest-priority features are is non-trivial; a Data Team that is even able to identify the highest-priority projects will already be doing better than many. Minimising cost is non-trivial too, because there are so many different ways to do the same thing. What can be seen in the diagrams, however, is that if simply achieving a net positive business value is the fundamental goal for a data team, then business value is likely to be less important in large / data-mature organisations, due to the scale of the opportunity presented to a small data team. Cost, and the realisation of Business Value, is more of a priority for smaller / less data-mature organisations.

A Version 1 of the DORA metrics for Data

We suggest the following definitions could be a good starting point for measuring Data Team performance:

(1) Deployment Frequency

Defined as: The frequency with which new data or different data is deployed into Production

Measured by: Integrations to Git providers; records of data versions should be in code

Good: Daily (or more often)

Medium: Weekly

Bad: Every two weeks
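One possible way to turn the bands above into a check is sketched below. It takes the dates of production deployments and rates the typical (median) gap between them. The function name and the cut-offs of one and seven days are assumptions for this example; in particular, the bands above do not say how to rate a cadence between weekly and fortnightly, so the sketch treats anything slower than weekly as "bad".

```python
from datetime import date
from statistics import median

def classify_deployment_frequency(deploy_dates):
    """Classify release cadence from the median gap (in days) between deployments."""
    ordered = sorted(deploy_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two deployments to measure a cadence")
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    typical_gap = median(gaps)
    if typical_gap <= 1:
        return "good"    # daily or more often
    if typical_gap <= 7:
        return "medium"  # roughly weekly
    return "bad"         # slower than weekly (assumed cut-off)

# Hypothetical deployment dates, e.g. taken from merge records in a Git provider.
deploys = [date(2023, 8, 1), date(2023, 8, 8), date(2023, 8, 14), date(2023, 8, 22)]
print(classify_deployment_frequency(deploys))  # medium
```

The median is used rather than the mean so that a single long gap, such as a holiday period, does not dominate the result.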

(2) Lead time

Defined as: the time it takes to go from idea to data released in production

Measured by: Integrations to Git providers

Good: Within a day

Medium: Within a week

Bad: Two weeks or more
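A minimal sketch of the lead time calculation follows, assuming each change is recorded as a pair of timestamps: when the idea was raised and when the resulting data was released to production. The function name and the record layout are hypothetical; in practice the first timestamp might come from a ticket and the second from a merge or deployment record.

```python
from datetime import datetime
from statistics import median

SECONDS_PER_DAY = 86400

def median_lead_time_days(changes):
    """Median days between an idea being raised and its data reaching production."""
    durations = [
        (released - raised).total_seconds() / SECONDS_PER_DAY
        for raised, released in changes
    ]
    return median(durations)

# Hypothetical (raised, released) pairs.
changes = [
    (datetime(2023, 8, 1, 9), datetime(2023, 8, 2, 9)),   # 1 day
    (datetime(2023, 8, 3, 9), datetime(2023, 8, 6, 9)),   # 3 days
    (datetime(2023, 8, 7, 9), datetime(2023, 8, 17, 9)),  # 10 days
]
print(median_lead_time_days(changes))  # 3.0
```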

(3) Time to fix data quality issues

Defined as: the time it takes from a data quality issue arising to the time that data quality issue is fixed in production

Measured by: Integrations to Git providers, manual logging

Good: Hours

Medium: Days

Bad: Weeks
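The sketch below shows one way the bands above could be applied, assuming incidents are logged as a detection time and a fix time, with `None` for issues that are still open. The function name and the exact cut-offs (under a day for "hours", under a week for "days") are assumptions for this example.

```python
from datetime import datetime, timedelta

def rate_time_to_fix(incidents):
    """Rate the mean time to fix across resolved incidents; open ones are skipped."""
    resolved = [fixed - found for found, fixed in incidents if fixed is not None]
    if not resolved:
        return None  # nothing has been fixed yet, so there is nothing to rate
    mean_fix = sum(resolved, timedelta()) / len(resolved)
    if mean_fix < timedelta(days=1):
        return "good"    # fixed within hours
    if mean_fix < timedelta(weeks=1):
        return "medium"  # fixed within days
    return "bad"         # takes a week or more

# Hypothetical incident log: (detected, fixed) pairs.
incidents = [
    (datetime(2023, 8, 1, 8), datetime(2023, 8, 1, 12)),  # fixed in 4 hours
    (datetime(2023, 8, 2, 8), datetime(2023, 8, 4, 8)),   # fixed in 48 hours
    (datetime(2023, 8, 5, 8), None),                      # still open
]
print(rate_time_to_fix(incidents))  # medium (mean of 26 hours)
```

Skipping open incidents flatters the number, so it is worth reporting the count of unresolved issues alongside it.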

(4) Change failure rate

Defined as: the percentage of changes to production data that result in data quality issues

Measured by: data quality issues in (3) divided by the number of merged code changes

Good: 0%

Medium: 25%

Bad: 50%
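As a sketch of the calculation, the example below assumes each data quality issue has been linked, manually or otherwise, to the merged change that caused it. The function name and identifiers are hypothetical. Note one deliberate choice: a change that causes several issues is counted once, which follows the definition (the percentage of changes that result in issues) and gives a lower figure than dividing the raw issue count by the number of changes.

```python
def change_failure_rate(merged_change_ids, issue_to_change):
    """Percentage of merged changes linked to at least one data quality issue."""
    merged = set(merged_change_ids)
    if not merged:
        raise ValueError("no merged changes in the period")
    # Collect distinct changes blamed for issues, ignoring changes outside the period.
    failed = {change for change in issue_to_change.values() if change in merged}
    return 100 * len(failed) / len(merged)

# Hypothetical data: four merged changes, three issues, two of them from change "c2".
merged = ["c1", "c2", "c3", "c4"]
issues = {"dq-1": "c2", "dq-2": "c2", "dq-3": "c4"}
print(change_failure_rate(merged, issues))  # 50.0
```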

(5) Cost/Benefit ratio

Defined as: the cost of the data operation relative to the benefit it delivers

Measured by: total cost ($), including salaries, divided by a tangible $ figure for the benefit where one is available

Good: 0.5 or lower

Medium: 0.8 or lower

Bad: 1+
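To make the arithmetic concrete, here is a small sketch that computes the ratio together with the net benefit. The function name, the split of cost into salaries and tooling, and the figures are all assumptions for illustration; putting a dollar value on the benefit is the hard part and is not addressed here.

```python
def cost_benefit(salaries, tooling, benefit):
    """Return (cost/benefit ratio, net benefit) for one period, in the same currency."""
    if benefit <= 0:
        raise ValueError("benefit must be positive to compute a ratio")
    total_cost = salaries + tooling
    return total_cost / benefit, benefit - total_cost

# Hypothetical monthly figures in dollars.
ratio, net = cost_benefit(salaries=30_000, tooling=2_000, benefit=64_000)
print(ratio, net)  # 0.5 32000
```

A ratio below 1 is the same thing as a positive net benefit, which is why 1+ sits in the "bad" band.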

(6) Data Quality

Defined as: the extent to which data quality tests pass in Production and the extent to which data in Production is updated and recent

Measured by: test outputs

Good: 95% or above

Medium: 75% or above

Bad: 50% or above
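The definition above has two parts, test outcomes and freshness, so the sketch below reports them as two separate percentages. The function name, the test and table names, and the 24-hour freshness window are assumptions for this example; how the two numbers should be combined into one score is left open.

```python
from datetime import datetime, timedelta

def data_quality_scores(test_results, last_updated, now, max_age):
    """Return (test pass rate %, share of tables refreshed within max_age %)."""
    # True counts as 1 and False as 0, so the sum is the number of passing tests.
    pass_rate = 100 * sum(test_results.values()) / len(test_results)
    fresh = sum(1 for updated_at in last_updated.values() if now - updated_at <= max_age)
    freshness = 100 * fresh / len(last_updated)
    return pass_rate, freshness

# Hypothetical test outcomes and table refresh times.
now = datetime(2023, 8, 23, 12)
tests = {
    "orders_not_null": True,
    "orders_unique_id": True,
    "customers_valid_email": False,
    "payments_positive": True,
}
last_updated = {
    "orders": now - timedelta(hours=2),
    "customers": now - timedelta(hours=30),
}
print(data_quality_scores(tests, last_updated, now, max_age=timedelta(hours=24)))
# (75.0, 50.0)
```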

(7) Data Release Pipeline reliability

Defined as: the % of releases to production that succeed

Measured by: pipeline run outputs

Good: 95% or above

Medium: 75% or above

Bad: 50% or above
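A sketch of the reliability calculation is shown below, assuming run outputs are available as (pipeline name, status) pairs and that a status of "success" marks a successful release. Both the function name and the status values are assumptions; orchestration tools report run states in their own formats. The result is broken down per pipeline, since a single overall percentage can hide one consistently failing pipeline.

```python
from collections import defaultdict

def pipeline_reliability(runs):
    """Percentage of successful runs per pipeline, from (pipeline, status) pairs."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for pipeline, status in runs:
        totals[pipeline] += 1
        if status == "success":
            successes[pipeline] += 1
    return {name: 100 * successes[name] / totals[name] for name in totals}

# Hypothetical run history.
runs = [
    ("daily_sales", "success"),
    ("daily_sales", "success"),
    ("daily_sales", "failed"),
    ("daily_sales", "success"),
    ("crm_sync", "success"),
    ("crm_sync", "success"),
]
print(pipeline_reliability(runs))  # {'daily_sales': 75.0, 'crm_sync': 100.0}
```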


In this article, we discussed the DORA metrics and saw how they are suited to assessing the performance of software delivery teams. We saw that transferring these directly to Data teams is likely to be insufficient for assessing a Data Team’s productivity, and amended some of them to reflect the fact that Data teams ship data into production in addition to code. We also argued that, because the role is fundamentally different and focussed on business value, it makes sense to include the Cost/Benefit of the operation when assessing performance. Finally, we added Data Quality and Data Release Pipeline reliability metrics, which go one step further than deployment frequency and lead time, given that the typical release pipeline for data is significantly more complex than a software release pipeline.


  2. 2022 State of DevOps report:



I write on Data engineering and the coolest data stuff. CEO@ Orchestra, the best-in-class data pipeline management platform.