The many flaws of flow efficiency

Nick Brown
ASOS Tech Blog
Mar 22, 2023 · 9 min read

As organisations try to improve their ways of working, better efficiency is often cited as a goal. ‘Increasing flow’ is something else you may hear, with Flow Efficiency called out as a measure organisations need to focus on. The problem is that few, if any, share the many flaws of this metric.

Read on to find out just what these pitfalls are and, more importantly, what alternatives you might want to focus on instead to improve your way of working…

Queues

Queues are the enemy of flow in pretty much every context, but especially software development. Dependencies, blockers and org structure(s) are just a few of the reasons why work sits idle. In the world of Lean manufacturing, Taiichi Ohno once stated that in a typical process only around 5% of the time is spent on value-adding activity. There is also the measure of overall equipment effectiveness (OEE), which shows many manufacturing lines running at only around 60% productivity.

More recently, the work of Don Reinertsen in The Principles of Product Development Flow has been a frequent inspiration for Agile practitioners, with this quote in particular standing out:

Our greatest waste is not unproductive engineers but work products sitting idle in process queues.

Many thought leaders, coaches and consultants champion the use of a metric known as flow efficiency when coaching teams and organisations about improving their way of working, but what exactly is it?

What is flow efficiency?

Flow efficiency is an adaptation of the lean metric of process efficiency. For a particular work item (backlog item, user story, whatever your preferred taxonomy is), we measure the percentage of active time — i.e., the time spent actually working on the item — against the total time (active time + waiting time) it took for the item to complete.

For example, if we were to take a software development team’s Kanban board, it may look something like this:

Source: Flow Efficiency: Powering the Current of Your Work

Where flow efficiency would be calculated like so:

Flow efficiency (%) = Active time / Total time x 100%
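The calculation itself is trivial. As a minimal sketch (the columns, durations and the choice of which columns count as ‘active’ below are all hypothetical; each team would need to make that call for its own workflow):

```python
from datetime import timedelta

# Hypothetical time spent by one work item in each column of the board
time_in_column = {
    "In Dev": timedelta(days=2),     # active
    "Dev Done": timedelta(days=3),   # waiting
    "In Test": timedelta(days=1),    # active
    "Test Done": timedelta(days=4),  # waiting
}
active_columns = {"In Dev", "In Test"}

active = sum((d for col, d in time_in_column.items() if col in active_columns), timedelta())
total = sum(time_in_column.values(), timedelta())

print(f"Flow efficiency: {active / total * 100:.0f}%")  # Flow efficiency: 30%
```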

The oft-quoted ‘industry standard’ says that anything between 15% and 40% flow efficiency is good.

In terms of visualising flow efficiency, it will typically look something like this:

Source: Flow Efficiency: Powering the Current of Your Work

In this chart, we can see the frequency (number) of work items with a certain percentage flow efficiency, and an aggregated view of what the average flow efficiency looks like.

All makes sense, right? Many practitioners would also advocate this as an important thing to measure.

I disagree. In fact, I would go as far as to say that I believe flow efficiency to be the least practical and most overhyped metric in our industry.

So, what exactly are some of the problems with it?

Anecdotal evidence of “typical” flow efficiency

Now I don’t disagree with the ideas of those above regarding queues being an issue, with a lot of time spent waiting. I also don’t deny that flow efficiency in most organisations is likely to be poor. My issue is with those who cite flow efficiency percentages, quoting ‘industry standards’ and what good looks like, without any solid proof. “I’ve seen flow efficiency percentages of n%” is a common soundbite you may hear; #DataOrItDidntHappen needs to be a more frequent hashtag for some claims in our industry. If we take a few examples near the top of a quick Google search:

I finally thought I’d found some hard data with “the average Scrum team Process Efficiency for completing a Product Backlog Item is on the order of 5–10%”, which is cited in Process Efficiency — Adapting Flow to the Agile Improvement Effort. That is, until we see the full text:

And then the supporting reference link:

Surveying a few people in a classroom hardly constitutes ‘the average Scrum team’.
It amazes me that, with all the years of data collated in various tools and our frequent emphasis on empiricism, there is not one single study that validates the claims made about what flow efficiency percentages “typically” are.

Lack of wait states

Now, setting aside the lack of a true study, let’s look at how a typical team works. Plenty of teams do not know, or have not identified, the wait states in their workflow:

In this example (which is not uncommon), all the workflow states are ‘active’ states, so there is no way to calculate when work is waiting, and flow efficiency will always be 100% (and therefore useless!). Plenty of teams are in this position: they do in fact know what their wait states are, yet have not modelled them appropriately in their workflow.
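To labour the ‘always 100%’ point with a tiny sketch (the columns and durations are made up): if every column is treated as an active state, the calculation can only ever return 100%, whatever the durations are:

```python
from datetime import timedelta

# A workflow where every column is an 'active' state (no wait columns modelled)
time_in_column = {
    "In Dev": timedelta(days=5),
    "In Test": timedelta(days=3),
}
active_columns = set(time_in_column)  # everything counts as active

active = sum((d for col, d in time_in_column.items() if col in active_columns), timedelta())
total = sum(time_in_column.values(), timedelta())

print(f"{active / total * 100:.0f}%")  # 100%, regardless of the durations
```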

Impossibility of measuring start/stop time

Let’s say now we’ve identified our wait states and modelled them appropriately in our workflow:

How often (in my experience, fairly regularly!) do we hear updates like the below when reviewing the board:

Expecting near real-time updates (in order to accurately reflect active vs. wait time) is just not practical, and therefore any flow efficiency number is flawed due to this delay in updating items. Furthermore, there are so many nuances in product development that making a binary call as to whether something is active or waiting is impossible. Is thinking through a problem on a walk active time or idle time? What about experimentation? And think about when we leave work at the end of the day: none of our items are being worked on, so shouldn’t they all be marked as ‘waiting’ until the next day?

Not accounting for blockers

Keeping the same workflow as before, the next scenario to consider is how we handle when work is blocked.

This particular item is highlighted/tagged as blocked because we need feedback before we can move it along in our workflow. Yet it sits in a ‘work’ state even though it cannot be progressed. More often than not, this is not factored into any flow efficiency calculation or literature, such as this example:

Tasktop — Where is the Waste in Your Software Delivery Process?

There is no way an item was “In Dev” for a clear, uninterrupted period, and therefore the picture presented is not a realistic one of how product development actually happens.

That’s Numberwang!

For those unaware, Numberwang is a well-known sketch from the comedy TV show That Mitchell and Webb Look. It is a fictional gameshow in which the two contestants call out random numbers, which the ‘host’ then seemingly at random declares to be “Numberwang!”

Why is this relevant? Well, when looking at items that have moved through our process and their respective flow efficiency percentages, all we are doing is playing an Agilist’s version of the same comedy sketch.

Face ID Login has a flow efficiency of 19% but QR code returns only had 9%! OMG :( :( :( So what? It’s just a percentage; it doesn’t mean anything! Also, look at the cycle times for those items: can we definitively say that one item was “more efficient” than the other? Does this tell us anything about how to improve our workflow and where our bottlenecks are? No! It’s just reading out numbers and thinking it means something because it’s “data”.
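The real cycle times for these items aren’t shown here, so here is a purely hypothetical sketch of why the percentage on its own tells us nothing:

```python
# Entirely made-up cycle times, purely to illustrate the point
items = {
    "Face ID Login":   {"flow_efficiency": 0.19, "cycle_time_days": 21},
    "QR code returns": {"flow_efficiency": 0.09, "cycle_time_days": 11},
}

for name, m in items.items():
    active_days = m["flow_efficiency"] * m["cycle_time_days"]
    print(f"{name}: {m['flow_efficiency']:.0%} efficient, "
          f"{m['cycle_time_days']} days end to end, ~{active_days:.0f} active days")

# Face ID Login: 19% efficient, 21 days end to end, ~4 active days
# QR code returns: 9% efficient, 11 days end to end, ~1 active days
# The 'less efficient' item still reached customers ten days sooner.
```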

The Flaw of Averages

Anyone who has read previous posts of mine will know that any use of averages with flow metrics is a sure way to push my buttons. Unfortunately, the visualisation of flow efficiency often comes with an average of the efficiency for a collection of completed work items, like so:

Using averages with any metric is a dangerous flirtation with misleading information, and for a series of items we can see how easily this happens:

Three of our five completed items have poor flow efficiency, yet aggregating to a single number suggests (if the “close to 40% flow efficiency is good” anecdote is being cited!) that we have a fairly effective process. By aggregating we lose all the context of those ‘inefficient’ items, which we might otherwise use as the basis for a conversation around improving our way of working.
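To make this concrete with made-up numbers that mirror the chart above:

```python
from statistics import mean

# Hypothetical flow efficiencies (%) for five completed items: three poor, two 'good'
flow_efficiencies = [10, 12, 15, 75, 80]

print(f"Average flow efficiency: {mean(flow_efficiencies):.1f}%")  # 38.4%, looks 'healthy'
print(f"Items below 20%: {sum(fe < 20 for fe in flow_efficiencies)} of {len(flow_efficiencies)}")  # 3 of 5
```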

What should we use instead?

In theory, flow efficiency seems like a good idea. However, when you look at all the reasons above, it is simply not practical for teams and organisations to implement and put to effective use (without at least being clear they are using flawed data). Treat with caution anyone advocating it without the caveats mentioned above!

Thank you to my friend Javier Bonnemaison for these tweets shared after publication:

A better use of your time is looking at blocker data and going after the blockers that occur most frequently and/or are the most impactful. Troy Magennis has a great tool for this (thank you also to Troy for sharing some thoughts on this piece).
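Troy’s tool does this properly; as a minimal sketch of the underlying idea (the blocker reasons and numbers below are invented, and this is not his tool), all you need is a record of why items were blocked and for how long:

```python
from collections import defaultdict

# Hypothetical blocker log: (reason the item was blocked, days it sat blocked)
blocker_log = [
    ("Waiting on third-party API", 4),
    ("Test environment unavailable", 2),
    ("Waiting on third-party API", 6),
    ("Unclear acceptance criteria", 1),
    ("Test environment unavailable", 3),
    ("Waiting on third-party API", 2),
]

summary = defaultdict(lambda: {"count": 0, "days_blocked": 0})
for reason, days in blocker_log:
    summary[reason]["count"] += 1
    summary[reason]["days_blocked"] += days

# Rank by total days lost to decide which blocker to go after first
for reason, s in sorted(summary.items(), key=lambda kv: kv[1]["days_blocked"], reverse=True):
    print(f"{reason}: {s['count']} occurrences, {s['days_blocked']} days blocked")
```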

Here are some of the examples we use for some of our teams here at ASOS:

Shout out to Francis Gilbert in our Saved Items team for these!

This can then be used to reduce the frequency of particular blockers occurring and to see where you need to focus next:

Shout out to Gary Sedgewick in our PayTech team for this!

This way you’re actually going after the problems teams face, which will in turn positively impact the flow of work. All this is done without the need for some ‘efficiency’ measure/number.

What are your thoughts? Agree? Disagree?
I’d love to hear what you think in the comments below…

About Me

I’m Nick, one of our Agile Coaches at ASOS. I help guide individuals, teams and platforms to improve their ways of working in a framework-agnostic manner. Outside of work, my newborn son keeps me on my toes agility-wise; here’s me trying to influence him to share his dad’s interest in sneakers…

ASOS are hiring across a range of roles in Tech. See all our open positions
