What DevOps Can Learn from Scaled Agile

Cost of Delay

One of the hardest things a DevOps person can do is lobby a Product's Business Stakeholders to actually invest in the Build and Deployment architecture. All too often, investments in internal structures are cut or slimmed down in favor of investments in features customers can actually touch and see. That means DevOps is chronically under-funded unless (a) a CTO can find a secret pile of money to fund DevOps investments separately or (b) the Product crashes badly enough that the problem is (finally and sadly) too big to ignore.

Here is a trick from the Scaled Agile and Lean Thinking advocates that DevOps people would be well-advised to learn to speak: Cost of Delay.

Quite simply, the Cost of Delay is the estimated cost of NOT making an investment. Traditional Cost-Benefit ratios are all too often gamed and tilt the scale towards potential revenue-generating features; Cost of Delay helps balance that equation.

First, a cup of accounting reality for my DevOps friends:

  • It takes $10 of cost reduction to equal $1 of revenue increase. That is just math. Revenue is the numerator and costs are the denominator. Therefore, you need to come up with roughly 10x the cost-reduction dollars to be in the same trade-off conversation as a potential revenue-generating feature.
  • Don’t get elaborate with your cost estimates. Find out the standard daily cost per person at your company and use that. I typically see $10,000 per day for North America and $8,000 per day for typical offshore countries.
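The two rules of thumb above can be combined into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function names, the 50 working weeks per year and the sample team (2 person-days lost per week) are assumptions made up for the example, while the 10:1 ratio and the $10,000 daily rate are the figures quoted above.

```python
# Back-of-the-envelope sketch; the team figures are illustrative assumptions.

COST_TO_REVENUE_RATIO = 10  # rule of thumb from the text: $10 saved ~ $1 earned


def annual_cost_of_waste(person_days_lost_per_week, daily_cost_per_person,
                         weeks_per_year=50):
    """Dollar cost per year of the time a team loses to a manual process."""
    return person_days_lost_per_week * daily_cost_per_person * weeks_per_year


def revenue_equivalent(cost_reduction, ratio=COST_TO_REVENUE_RATIO):
    """Revenue increase that a given cost reduction can be compared against."""
    return cost_reduction / ratio


# Hypothetical team: 2 person-days lost per week at $10,000 per person-day.
savings = annual_cost_of_waste(person_days_lost_per_week=2,
                               daily_cost_per_person=10_000)
print(f"Annual cost reduction: ${savings:,.0f}")
print(f"Comparable revenue feature: ${revenue_equivalent(savings):,.0f}")
```

Under these assumptions the investment competes with a revenue feature one tenth its size, which is why a single standard daily rate is usually enough precision for the conversation.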

Second, a nod to the Masters

How to Speak Cost of Delay (Summary)

Values: for ease of estimating, rather than applying dollars or work hours, I highly recommend you use a simple modified Fibonacci sequence (1, 3, 5, 8 or 13), where the lower the number, the lower the value/cost/estimate.

These are the primary components when calculating Cost of Delay:

  • User and/or Business Value — What is the value the customer (which could be an internal customer) places on this Epic/Feature?
  • Time Criticality — What is the decay rate of this Epic/Feature? How much worse does our problem get the longer we delay?
  • Risk Reduction or Opportunity Enablement — In product feature terms, will our customers wait for us to develop the Epic/Feature or will they move to another solution? Does this Epic/Feature reduce the risk or variability in delivery? What other value or data will we get from this effort?
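As a minimal sketch, the three components can be captured in a small data structure. Note the assumption: the list above names the components but does not say how to combine them, so this example sums them, which is the convention used in SAFe's WSJF model. The class name, field names and sample scores are hypothetical.

```python
from dataclasses import dataclass

ALLOWED_SCORES = {1, 3, 5, 8, 13}  # the modified Fibonacci scale suggested above


@dataclass(frozen=True)
class CostOfDelay:
    """Relative Cost of Delay for one Epic/Feature (hypothetical helper)."""
    business_value: int
    time_criticality: int
    risk_or_opportunity: int

    def __post_init__(self):
        # Reject scores that are not on the agreed relative scale.
        for name, score in vars(self).items():
            if score not in ALLOWED_SCORES:
                raise ValueError(f"{name}={score} is not one of {sorted(ALLOWED_SCORES)}")

    @property
    def total(self):
        # Assumption: components are summed, following the SAFe convention.
        return self.business_value + self.time_criticality + self.risk_or_opportunity


# Illustrative scores only, not taken from a real estimation session.
pipeline = CostOfDelay(business_value=5, time_criticality=8, risk_or_opportunity=3)
print(pipeline.total)  # prints 16
```

Restricting the inputs to the agreed scale keeps the discussion relative and stops anyone from smuggling in false precision.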

Examples Using Common Development Architecture Initiatives

Example 1: Invest in an automated Build Pipeline to reduce manual interventions

Set up and configure a build automation pipeline that takes a semi-automated build process, eliminates 80% of the manual steps and reduces build time from 4–10 hours per build to an average of roughly 10 minutes. Given the pace of change in the current code base and the growth in the number of developers, Time Criticality grows exponentially, from a 3 in 2 months to an 8 in 4 months and a 13 in 5 months. The impact of mistyped manual steps also grows exponentially as the number of developers and testers increases, customer-driven feature delivery dates loom and recovery from missteps takes longer.

Example 2: Invest in Automated Test Environment Setup to reduce setup time

The team currently maintains 2 test environment profiles. Each environment reset requires 2–4 days of setup time and therefore can only be done once, or at most twice, during a Release lifecycle. During a test environment “refresh”, the entire environment is unavailable for testing. The impact degrades only slightly with time, as the number of developers and testers grows at less than 5% per year. The team releases product to customers once every 6 months, and that cadence is unlikely to change in the next 12 months.

Prioritize using Cost of Delay Divided by Job Size

Now you can create a numeric prioritization framework by creating a score for each Epic/Feature. That score is derived by calculating the Cost of Delay and then dividing the CoD by the Job Size. To greatly simplify matters, use a Relative Story Point estimate for the Epic/Feature as a reasonable proxy for Job Size.
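The scoring described above could be sketched as follows. Every number in the backlog is a made-up placeholder (the article does not assign CoD totals or Story Points to its examples), and the function and variable names are hypothetical.

```python
def priority_score(cost_of_delay, job_size):
    """Cost of Delay divided by Job Size; a higher score means do it sooner."""
    if job_size <= 0:
        raise ValueError("job_size must be a positive Story Point estimate")
    return cost_of_delay / job_size


# (name, Cost of Delay, Job Size in relative Story Points): placeholder values.
backlog = [
    ("Customer-facing feature", 21, 13),
    ("Automated test environment setup", 9, 5),
    ("Automated build pipeline", 24, 8),
]

# Sort so the highest CoD-per-unit-of-work item comes first.
ranked = sorted(backlog, key=lambda item: priority_score(item[1], item[2]),
                reverse=True)

for name, cod, size in ranked:
    print(f"{name}: {priority_score(cod, size):.2f}")
```

With these placeholder values the build pipeline ranks first, because it delivers a large Cost of Delay reduction for a comparatively small job. A small, urgent infrastructure item can outrank a larger revenue feature on the same scale, which is exactly the trade-off conversation you want to be in.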

Some other tips about using Cost of Delay

  • Exclude Sunk Costs. That is, don’t include how much you’ve already invested into a particular framework as part of your thinking or justifications. Proper economic models for evaluating investments (even personal financial ones) exclude consideration for how much you’ve already spent.
  • Express the CoD as a Range over Time. Resist the urge to normalize to a specific time period (e.g., 6 months). If your CoD is actually flat over time, that tells you a LOT about the risk of delaying until later (which is that delaying is really fine, and the item is therefore not a high priority to invest in). High priority items should show at least linear growth in CoD, if not exponential.
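One rough way to read a CoD range over time is to compare the slope between successive estimates. The sketch below is an assumption about how you might do that, not an established method: the function name, the labels and the tolerance are hypothetical. The first series reuses the Time Criticality scores from Example 1 (3 at 2 months, 8 at 4 months, 13 at 5 months); the flat series is invented for contrast.

```python
def growth_profile(points, tolerance=0.01):
    """Classify how a CoD score changes over time.

    points: (month, score) pairs sorted by strictly increasing month.
    """
    if len(points) < 2:
        raise ValueError("need at least two (month, score) points")
    # Change in score per month between each pair of neighbouring estimates.
    slopes = [(s2 - s1) / (m2 - m1)
              for (m1, s1), (m2, s2) in zip(points, points[1:])]
    if all(abs(s) <= tolerance for s in slopes):
        return "flat"
    rising = all(s > tolerance for s in slopes)
    speeding_up = all(b > a + tolerance for a, b in zip(slopes, slopes[1:]))
    if rising and len(slopes) > 1 and speeding_up:
        return "accelerating"
    if all(s >= -tolerance for s in slopes):
        return "steady growth"
    return "declining or mixed"


build_pipeline = [(2, 3), (4, 8), (5, 13)]   # Time Criticality from Example 1
flat_item = [(0, 5), (6, 5), (12, 5)]        # invented series for contrast

print(growth_profile(build_pipeline))  # prints accelerating
print(growth_profile(flat_item))       # prints flat
```

An item that comes back "flat" can safely wait; one that comes back "accelerating" is the kind of item worth arguing for now.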