Make Your Code Flow Like Water — Speeding Up Software Delivery

Published in Anamcara Capital · 9 min read · Sep 13, 2023

Gergely Orosz and Kent Beck recently published two articles on developer productivity in response to McKinsey.

I think they’re required reading for anyone running an engineering org. However, the developer productivity question is typically a loaded one.

Early stage CTOs rarely ask me about developer productivity. They do ask “what should my org chart look like?” or say “it feels like we’re slowing down even though we’ve added people”.

Understanding the flow of work in your organisation as you scale is the most important job of a CTO.

Profiling

Ever debugged a performance problem on a web app or website? If you have, there’s a chance you’ve used Chrome’s Network Profiler.

The network profiler is amazing at two jobs:

1. Visualising the flow of requests

2. Helping identify bottlenecks

A webpage is made up of dozens or hundreds of discrete components. So when a webpage is slow, any number of these components could be the cause.

With those tools, you can now start the process of optimising the performance of the page.

Are there things that need to run in parallel?
What individual tasks need to be sped up?

Using these same metaphors, we can speed up the flow of work in your organisation.

The Production Line

With a team of 3 to 4 people, work is easier. People own discrete tasks. There’s clear accountability. No team handoffs. Faster iteration.

As your team grows past 10, this is less true. Once you have multiple teams, you need handoffs between them. Now you have a production line.

Let’s assume we have a Designer, Frontend Devs, Backend Devs & Devops in this process. Each of them is involved in different stages.

If we mapped the skillsets, it might look like:

In a scaled up organisation, those could be individual teams.

Inner & Outer Loops

For each handoff & stage you should consider both the inner and outer loops.

The outer loops go between teams. Design is handing off to frontend. Frontend to backend and so on.

There are important inner loops happening as well. An inner loop is within a team or with an individual. Builds. Approvals. Reviews. Testing. Dev & Test environments. Access.

Inner loops tend not to surface in project management Gantt charts.

These inner loops can waste valuable minutes, hours or days. Minutes may not sound like much, but consider that those loops may be happening dozens or hundreds of times a day. They may matter more than the day-long loops that happen once a week or once a month.
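
To make that comparison concrete, here is a rough back-of-the-envelope sketch. Every number in it is hypothetical (a 4 minute build, 10 developers, a full-day release sign-off) and is there only to show the arithmetic, not to describe any real team.

```python
# Hypothetical numbers, chosen only to illustrate the comparison.
def weekly_cost_minutes(minutes_per_loop, loops_per_day, working_days=5):
    """Total minutes spent waiting on one kind of loop in a working week."""
    return minutes_per_loop * loops_per_day * working_days

# Inner loop: a 4 minute build that each of 10 developers runs 15 times a day.
build_wait = weekly_cost_minutes(minutes_per_loop=4, loops_per_day=15 * 10)

# Outer loop: a full-day (8 hour) release sign-off that happens once a week.
release_wait = weekly_cost_minutes(minutes_per_loop=8 * 60, loops_per_day=1 / 5)

print(f"Build loop:   {build_wait / 60:.0f} hours per week")
print(f"Release loop: {release_wait / 60:.0f} hours per week")
```

With these made-up inputs, the small loop costs the team several times more waiting per week than the big one, which is the point of the paragraph above.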

These inner loops are easier to surface in simpler tasks where the work is more atomic. They also tend to cascade to all work types. Any inner loop that exists in a simple code push will typically exist in a more complex one.

So, let’s start by reviewing how work gets done in our teams.

Start with trivial

Find a trivial enhancement that needs to be completed. Pick one that’s localised to a single team and that a single individual can accomplish. Change the colour of a button maybe.

We want to measure how long it takes to push to production. If you don’t have continuous deployment, then measure to the point of being included in the release. Or measure to production but take the release wait time out.

We want to avoid guessing. This is a measurement exercise.
The timer starts when the task is assigned. It doesn’t stop when the developer commits the change. It stops when the code gets to production.

Make the code change. Commit the code. Approve the change. Run the build & tests. Push a release.

Where are the bottlenecks to that process?

Frequently:

Long and/or brittle build & test
Onerous production release cycles
PR review turnarounds if you are mandating pull request reviews on all changes

Now low complexity

Here, I would look at pieces of work that require handoffs.

An example of this could be adding some new fields on a form in a web app. Little to no design needed. Frontend will have to do some work to add the UI form elements. There might be a database change required to add some fields to capture the data. There might be some validation work to be implemented to sanity check the input.

How long would this have taken you when you were a 3 person team?
How long is it taking now?

It’s important to try and measure here. I would encourage you to take this measurement all the way to production.

How is the work co-ordinated between frontend and backend?
Can the front-end engineer add those changes to the backend themselves or do they need to wait for a backend engineer?
Do you have a migration framework for initiating database changes? Does this flow all the way to production?
Do you have automated testing in place to safely install the changes?

Frequent problems:

Wait times & co-ordination problems between different teams
Lack of T-Shaping of engineers. Team members sticking strictly to their job description means more co-ordination.
Inadequate build & test environments. Reduces safety for making these changes.
No safe way to deploy database changes to production. Someone with production access required to carefully make changes.
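
For the last problem, a migration framework does not have to be elaborate. The following is a minimal sketch of the idea using Python’s built-in `sqlite3` module: numbered migrations live in code, and a version table records which ones have run. The `signup` table, its columns and the `schema_version` table are hypothetical names for the form example above; a real team would more likely adopt an existing migration tool than write their own.

```python
import sqlite3

# Hypothetical migrations for the "new form fields" example, applied in order.
MIGRATIONS = [
    (1, "CREATE TABLE signup (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"),
    (2, "ALTER TABLE signup ADD COLUMN company TEXT"),
    (3, "ALTER TABLE signup ADD COLUMN phone TEXT"),
]

def migrate(conn):
    """Apply migrations not yet recorded in schema_version; return those applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)")
    row = conn.execute("SELECT COALESCE(MAX(version), 0) FROM schema_version").fetchone()
    current = row[0]
    applied = []
    for version, statement in MIGRATIONS:
        if version > current:
            conn.execute(statement)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            conn.commit()
            applied.append(version)
    return applied

conn = sqlite3.connect(":memory:")
print(migrate(conn))  # first run applies every pending migration
print(migrate(conn))  # second run finds nothing left to do
```

Because the same function runs in dev, test and production, nobody needs to hand-edit the production database, and the frontend engineer adding a field can ship the schema change with the rest of the work.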

Moving to Medium to High Complexity

Here we are more focused on co-ordination problems. How is work getting handed off?

Map out the workflow in a Gantt chart. It can be helpful to see the sequencing. Like our Chrome Network Profiler, we can see where we’re slowing down. Where are the complex steps, and which teams are involved?

Where are we over-coordinating?
What skillsets could live in the same team to speed up handoffs (e.g. frontend & backend)?
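
If you don’t have a Gantt tool to hand, even a few lines of code can show where work sits idle between teams. This is a rough sketch with a hypothetical feature and invented day numbers; the useful output is the gap between one team finishing and the next one starting.

```python
# Hypothetical workflow for one feature: (team, start_day, end_day).
stages = [
    ("Design", 0, 3),
    ("Frontend", 5, 9),
    ("Backend", 9, 13),
    ("QA", 17, 19),
    ("Devops", 19, 20),
]

def handoff_waits(stages):
    """Days of idle time between one stage finishing and the next starting."""
    return [
        (f"{prev[0]} -> {nxt[0]}", nxt[1] - prev[2])
        for prev, nxt in zip(stages, stages[1:])
    ]

def render(stages):
    """Draw one row per stage: '.' before it starts, '#' while it runs."""
    rows = [f"{team:<9}" + "." * start + "#" * (end - start) for team, start, end in stages]
    return "\n".join(rows)

print(render(stages))
for handoff, days in handoff_waits(stages):
    print(f"{handoff:20s} waited {days} days")
```

The handoffs with the longest waits are the first places to ask whether two skillsets should sit in the same team.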

Bottleneck Theory & Capacity

The throughput of the system is constrained by its slowest component.

Consider a simple 3 team production process. Teams 1 & 2 can process 2 units of work in the same time that team 3 can process 1.

Work flows from station 1 to station 2 and then to station 3. At station 3, which is the QA team in this example, we start to see the problem.

Now, work starts piling up.

The system can only produce at the rate of the QA team which is 1 unit of work at a time. The rest of the work just piles up at that station. Increasing stress.

Within this setup, the only way to increase the throughput of the system is to add more resources at the QA station. Adding more frontend and backend engineers doesn’t help. It makes the situation worse: they produce more work, the queue in front of QA grows faster, and each piece of work waits longer before it ships.
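
A tiny simulation shows the effect. This is a simplified sketch, not a model of any real team: the station capacities, the arrival rate and the number of ticks are assumptions, and each unit of work moves at most one station per tick.

```python
def simulate(capacities, arrivals_per_tick, ticks):
    """Push work through stations in sequence; return (finished, queue sizes)."""
    queues = [0] * len(capacities)
    finished = 0
    for _ in range(ticks):
        queues[0] += arrivals_per_tick
        # Walk from the last station back to the first so that work
        # moves at most one station per tick.
        for i in reversed(range(len(capacities))):
            done = min(queues[i], capacities[i])
            queues[i] -= done
            if i + 1 < len(capacities):
                queues[i + 1] += done
            else:
                finished += done
    return finished, queues

# Stations are frontend, backend, QA (hypothetical capacities per tick).
baseline = simulate([2, 2, 1], arrivals_per_tick=2, ticks=10)
more_devs = simulate([4, 4, 1], arrivals_per_tick=4, ticks=10)
more_qa = simulate([2, 2, 2], arrivals_per_tick=2, ticks=10)

for label, (finished, queues) in [("baseline", baseline), ("more devs", more_devs), ("more QA", more_qa)]:
    print(f"{label:10s} finished={finished:2d} queues={queues}")
```

Doubling the upstream teams finishes exactly the same amount of work as the baseline and leaves a much larger pile in front of QA. Doubling QA capacity is the only change that increases what reaches production.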

Balancing this capacity as you grow your organisation is critical to avoid burned out teams. It’s also critical to make sure you’re increasing throughput and not decreasing it.

A few more thoughts here:

Not having the QA station at all is one answer. Lots of startups and modern software companies have moved away from traditional QA teams for exactly this reason.
Allocating resources into the same team also helps. The co-ordination problem can be solved within the team much more easily than it can between teams.
Make your team members more fungible. Can the developers move laterally across and help with QA resourcing when needed?
Increase the effective throughput of a station by making it non-blocking or only using it in certain circumstances. Maybe the QA team above is only utilised for certain complexity features. Or maybe it sits to the side and it isn’t a gate for production.

Feature Teams

Feature teams can really help here. They don’t eliminate co-ordination problems. They do localise them to the team level. That is much easier to control.

The further apart two teams live in the organisational chart, the greater the effort to co-ordinate.

It’s easier to align two team members who have the same manager. It’s easier to align two managers who have the same director.

This is a variation on Conway’s Law:

Organisations will design systems that mirror their own communication structure — Conway’s Law
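
One way to put a rough number on “how far apart” two teams are is to count the reporting links between them via their closest shared leader. The org chart below is entirely hypothetical, and the distance is only a crude proxy for co-ordination effort, not a measured cost.

```python
# Hypothetical org chart: each team or person maps to who they report to.
reports_to = {
    "frontend-team": "eng-manager-a",
    "backend-team": "eng-manager-a",
    "qa-team": "eng-manager-b",
    "eng-manager-a": "director-product-eng",
    "eng-manager-b": "director-product-eng",
    "devops-team": "director-platform",
    "director-product-eng": "cto",
    "director-platform": "cto",
}

def chain(node):
    """Return the node followed by everyone above it, up to the root."""
    path = [node]
    while path[-1] in reports_to:
        path.append(reports_to[path[-1]])
    return path

def org_distance(a, b):
    """Number of reporting links between a and b via their closest shared leader."""
    chain_a, chain_b = chain(a), chain(b)
    for steps_a, leader in enumerate(chain_a):
        if leader in chain_b:
            return steps_a + chain_b.index(leader)
    raise ValueError("no shared leader")

for other in ["backend-team", "qa-team", "devops-team"]:
    print(f"frontend-team <-> {other}: {org_distance('frontend-team', other)} links")
```

Teams that hand off to each other constantly but sit many links apart are candidates for moving into the same feature team or org unit.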

Break Down your Work aka Reduce your Batch Size

Breaking down work into much smaller pieces keeps the system flowing. It also reduces the overall time spent on a piece of work.

A lot of the theory here comes originally from Queuing Theory. See Little’s Law. It made its way into lean manufacturing and is discussed substantially in the book The Goal.

Reducing batch size is somewhat counter-intuitive. Many engineers will push back on contrived boundaries on a piece of work: “It doesn’t make sense to break this down any further”. There are good reasons to push through that resistance:
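
Little’s Law says that, for a stable system, the average number of items in the system equals the average arrival rate multiplied by the average time each item spends in the system. Rearranged, average lead time is work in progress divided by throughput. A small sketch with hypothetical numbers:

```python
def average_lead_time(work_in_progress, throughput_per_week):
    """Little's Law rearranged: average time in system = items in system / completion rate."""
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return work_in_progress / throughput_per_week

# Hypothetical team that completes 10 items per week.
print(average_lead_time(work_in_progress=30, throughput_per_week=10))  # 3.0 weeks
print(average_lead_time(work_in_progress=10, throughput_per_week=10))  # 1.0 week
```

With throughput held constant, cutting the amount of work in flight cuts the average time each item takes to get through, which is why limiting work in progress and shrinking batches go together.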

1. Larger pieces of work are harder to reason about. That means gigantic pull requests take much longer to review. Large feature surface areas take longer to QA.

2. Larger work items cause idle and wait times for the system overall. Smaller pieces of work can be processed more quickly which increases the overall capacity and utilization.

3. Larger estimates in the order of weeks or months are harder to interrogate. It’s much harder to be accurate when you’re estimating these huge chunks of work.

4. Success in software products comes from iteration. Breaking the work down into smaller chunks means less rework when you change direction.

Agile methodologies cover most of this in more depth.

Feature flags are a great way to cordon off unfinished features from production end-users and still keep the assembly line running all the way to production.
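
At its simplest, a feature flag is a conditional around the new code path. The sketch below is illustrative only: the flag name, the in-memory `FLAGS` dictionary and the allow-list of user ids are hypothetical, and real systems usually load flags from configuration or a flag service.

```python
# Hypothetical flag store; real systems usually load this from config or a service.
FLAGS = {
    "new_checkout_form": {"enabled": False, "allow_users": {"internal-tester"}},
}

def is_enabled(flag_name, user_id):
    """A flag is on if it is globally enabled or the user is on its allow-list."""
    flag = FLAGS.get(flag_name)
    if flag is None:
        return False  # unknown flags default to off
    return flag["enabled"] or user_id in flag["allow_users"]

def render_checkout(user_id):
    """Unfinished code ships to production but stays hidden behind the flag."""
    if is_enabled("new_checkout_form", user_id):
        return "new form"
    return "existing form"

print(render_checkout("customer-42"))      # existing form
print(render_checkout("internal-tester"))  # new form
```

The unfinished form can be merged and deployed in small pieces while customers keep seeing the existing one, and the team can test it in production before switching the flag on for everyone.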

The tactics

Don’t rush to solve problems that don’t exist for you. The exercise above can help you see where you currently waste time.

Here are some of the more common solutions I see and what problems they intend to solve.

Version control with a feature branch / PR workflow: allows developers to work more safely on similar areas of the system at the same time.
CI/CD practices: merge code regularly to prevent integration & build issues. Deliver to production regularly to smooth out delivery flow.
Feature flags: allow smaller batches of unfinished product to make their way through your pipeline. Pair with CI/CD.
Containers as artifacts: simplify dependency management. Keeps environments consistent and reduces variance.
Develop T-shaped skillsets: allow engineers to work across more tasks, which minimises co-ordination.
Feature teams / organising your teams to minimise dependencies: removes co-ordination time.
Organisational chart design (for larger orgs): place teams that collaborate heavily close together within a single org unit. This allows a single leader to own the outcome.
Break your work down: most Agile methodologies encourage this, but you don’t need to be using any particular process to do it. Plan your work, break it down into small chunks (say less than 1 week) and deliver them.
Focus on Developer Experience (DX): this includes automated testing, builds & environments. Use your analysis above to determine what’s most impactful for you.