John Lewis Partnership — Measuring Our Delivery Performance Using DORA Metrics

--

Hi, I'm Rob Hornby, Product Lead for the John Lewis & Partners Digital Platform. Today I'm going to run through the work we've done using DORA to evidence how we've increased our delivery performance, taking us from 10 deployments a year to 5,000.

Introduction

In the past few years at John Lewis, our teams have moved from deploying 10 times a year (on our legacy commerce service) to over 5,000 times a year through our new digital services. We've done this by focusing on measuring the four key metrics detailed in DORA and improving the capabilities they outline.

Background

Historically the John Lewis Partnership (which includes John Lewis and Waitrose) has measured delivery performance through traditional service management metrics, rooted in ITIL processes such as change, incident and problem management. While these can be informative, they don't provide a holistic view of delivery performance nor, crucially, show where to improve.

We've been interested in DORA (DevOps Research & Assessment) for a while. DORA provides a consistent approach to measuring software delivery performance, surveying the industry and reporting the results each year in the State of DevOps report (2021). Four key metrics (plus a newer fifth) provide a simple starting point: measure your technical delivery performance, baseline yourself against the wider IT industry, and explore the broader set of capabilities that influence the metrics.

  1. Deployment Frequency: how often we deploy to production. Frequent deployment is a sign of a healthy release cadence and of continuous delivery technique, adding value sooner and reducing the risk of a stale pipeline and the hard-to-solve production issues that come with large releases. We want to evidence frequent delivery.
  2. Lead Time: how long it takes committed code to reach production. This asks questions of your testing approach, your continuous delivery, and your ability to react in the event of an incident or business need. We want our lead time to be fast.
  3. Change Failure Rate: can we deploy our services to production safely, on a regular basis, without impacting customers? We want a low failure rate.
  4. Time to Restore: how quickly we recover a service in the event of a failure (e.g. fix forward, roll back). We want a fast time to restore.
  5. Reliability: the newest of the metrics, and one we'll return to later. It focuses on availability and the typical “Golden Signal” measures, something we've been doing for a while. We want services that are reliable and observable.

Initial Survey

Initially we used the DORA Quick Check questionnaire to collate data quickly across a wide range of services within the John Lewis Partnership. This was an experiment to get feedback without developing a complex technical solution across the organisation. We have many different approaches across our estate, from leading-edge digital platforms for our e-commerce and data insight services through to legacy mainframe services. We captured the responses in a Google Form and then aggregated the data.

What this quickly gives you is a maturity rating across your technical landscape, ranking you from Low to Elite in terms of delivery performance. More importantly, it also gives you insight into how to move to the next level of maturity. It also helps to showcase the success of our digital strategies in creating platforms, and the cultural change towards a more agile, product-focused organisation.

Automating Data Gathering

We've also now automated the collection of DORA metrics on the John Lewis Digital Platform, collecting data from our service management tooling (ServiceNow) and our deployment tooling (GitLab). This data is aggregated in BigQuery and presented back through Grafana (you can see an example below).

DORA Technical Landscape

Within Grafana we’ve created an overall dashboard with a league table on the performance of all our services, along with a dashboard per service (such as for our Product Catalogue, or Checkout). You can then use the time series capabilities of Grafana to evidence the metrics over different periods of time.

A Typical Service Running on JLDP — Catalogue

Metrics Calculation

We have used the data available to us to present our own interpretation of the DORA metrics, based on data already used more widely in the organisation.

Deployment Frequency is calculated as the number of deployments logged through our deployment API over a given period.
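As a minimal sketch (the data shape is illustrative, not our actual deployment API), deployment frequency is simply a count of deployment events normalised over a time window:

```python
from datetime import date

def deployments_per_week(deploy_dates, window_days):
    """Average weekly deployment frequency over a window of days."""
    return len(deploy_dates) / (window_days / 7)

# Illustrative: 12 deployments logged over a 28-day window.
deploys = [date(2022, 3, d) for d in range(1, 13)]
print(deployments_per_week(deploys, 28))  # 3.0 deployments per week
```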

Lead Time to Change is the number of days between the start of a deployment and the oldest commit contained within the release. We find the oldest commit by comparing all commits between this release and the previous deployment and taking the earliest commit date.
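The calculation can be sketched as follows; the function and data names are illustrative, assuming each release carries the commit dates landed since the previous deployment:

```python
from datetime import datetime

def lead_time_days(deployment_start, commit_dates):
    """Lead time: days between the oldest commit in the release
    and the start of the deployment that shipped it."""
    oldest = min(commit_dates)  # earliest committed date in the release
    return (deployment_start - oldest).total_seconds() / 86400

# Illustrative data: commits made since the previous deployment.
commits = [
    datetime(2022, 3, 1, 9, 30),
    datetime(2022, 3, 2, 14, 0),
    datetime(2022, 2, 28, 16, 45),  # the oldest commit in this release
]
deploy_start = datetime(2022, 3, 3, 10, 0)

print(round(lead_time_days(deploy_start, commits), 2))  # 2.72 days
```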

For Change Failure Rate we examine the ServiceNow change records for any state other than “successful”. Any other state is classified as a failure.

Time to Restore looks at incident data in ServiceNow and compares the resolved and created dates, along with status and assignment group, to determine a duration.
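The last two calculations can be sketched in the same spirit; the record fields below are illustrative stand-ins, not the real ServiceNow schema:

```python
from datetime import datetime

def change_failure_rate(change_records):
    """Fraction of change records in any state other than 'successful'."""
    failures = sum(1 for c in change_records if c["state"] != "successful")
    return failures / len(change_records)

def time_to_restore_hours(incident):
    """Hours between an incident being created and being resolved."""
    return (incident["resolved"] - incident["created"]).total_seconds() / 3600

# Illustrative records, loosely modelled on ServiceNow change/incident data.
changes = [
    {"state": "successful"},
    {"state": "successful"},
    {"state": "unsuccessful"},  # counts as a change failure
    {"state": "successful"},
]
incident = {
    "created": datetime(2022, 3, 3, 11, 0),
    "resolved": datetime(2022, 3, 3, 13, 30),
}

print(change_failure_rate(changes))     # 0.25
print(time_to_restore_hours(incident))  # 2.5
```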

Survey Results

The survey enabled us to get a quick view across 60+ services within the organisation. It evidenced progress in our strategy towards cloud adoption, use of continuous delivery techniques and a build it/run it approach to delivery.

Overall, the Partnership has historically tended towards Medium delivery performance, especially in some of our older services where pace of change is less important or where a complex, tightly coupled architecture makes improvement difficult.

Responding to the need for change, we've built our digital platforms, establishing and improving the capabilities described below. They have delivered a step change in our ability to deliver through the digital services running on them.

Typical Areas of Focus for Medium Performers

Digital Services Results

The teams building our digital services averaged as High performers. This is where most of our internal Partner engineers work, and where our differentiators are built on our digital platforms. There are also a number of exemplar teams managing services that can be described as Elite, e.g. Search. This goes to show the amazing work done by our Partners and suppliers over the last few years.

Typical Result for a Digital Platform Service

This places us well when compared to the wider retail industry. We've moved from deploying 10 times a year on our legacy commerce service to over 5,000 times a year through our new digital services within John Lewis alone. We've done this in just a few years by focusing on many of the capabilities detailed in DORA, all while maintaining a low change failure rate and thriving through peak trading: the Covid period and our key sales events of Black Friday, Christmas and the Summer sale.

DORA Assessment of Retail Industry

The DORA assessment recommends we now focus on improving our technical processes, our learning culture and our management of work in progress: all things we are actively thinking about and progressing.

Digital Platform Potential Areas Of Focus

What we’ve learned

  • Evidencing Strategy — The data supported our ongoing strategy towards cloud, greater organisational agility and adoption of a loosely coupled architecture. It also supported recent strategy review work targeting less mature services requiring transformation.
  • Dependencies — Services were typically slower when tightly coupled to dependencies, restricting their ability to improve the pace of change. This again evidenced our need to move towards a more loosely coupled architecture. Changing architectural approach is the hardest step in maturing delivery, and one we will need to tackle to become a truly digital organisation merging the online and store experience.
  • Data Driven — The survey was too open to interpretation, and to the human desire to show success. Automation has helped standardise the approach and made the results consistent. As we widen it to our other platforms and services, we will need to maintain that consistent approach to measurement.
  • Data Quality — Our service management system, the audit record for incidents, doesn't always accurately capture incident start and resolved dates. Surfacing this data helps us spot and correct anomalies, and improving our Paved Road for incident management is one way to improve data accuracy.
  • What constitutes failure — Defining failure is hard. We don't currently account for failures detected through a subsequent corrective change or rollback. This may be fine, as we don't see significant issues with availability.

What’s Next

We haven't yet discussed the fifth DORA metric, introduced for 2021/22, which focuses on reliability. We've tracked the underlying data for a significant period, so this will be straightforward to adopt as an additional maturity metric.

  • We already provide observability over each service using Google Site Reliability Engineering's “Golden Signal” dashboards
  • We've also started to measure availability, using potential revenue impact to set availability targets for each service

The next step is making these metrics a core part of how the organisation measures delivery performance. We've done some great work to capture this data, through manual review and through automation on one of our digital platforms (JLDP). We are going to look at replicating this approach across our Waitrose digital platform and then more widely where possible. While this is useful insight, we don't expect to use it with any target in mind. This is a critical point, as there are many reasons why you wouldn't want all services at the same level of performance.

The data is great, but if we don't act on it then the effort is wasted. Making this data part of our core strategy, and moving away from historical service management metrics as the way we measure overall performance, is key. It's understood within our Engineering profession, but needs to permeate through the organisation across both business and technical areas.

— — — — — —

At the John Lewis Partnership we value bringing the creativity of expert software engineers to bear on the challenges of discovering innovative solutions. We craft the future of two of Britain’s best loved brands (John Lewis & Waitrose) in an environment which respects diversity, values personal development and empowers individuals.

We are currently recruiting across a range of software engineering specialisms. If you like what you have read and want to learn how to join us, take the first steps here.

--


Rob Hornby
John Lewis Partnership Software Engineering

Lead Engineer within our Technical Profession & Platform Product Lead for John Lewis with a background in retail technologies, software testing and platforms.