Don’t burn out, burn down: how we learned to sprint on shifting sands
When a Skyscanner squad were tasked with a project involving shifting, hard-to-predict objectives, their sprint completion rate dropped to 0%. In this essential post, squad leader Cristiano Balducci recounts how they fought their way back to 100% — and iterative delivery.
Here at Skyscanner we are arranged into squads and tribes. I’m a squad lead, and my squad, like many of the other engineering squads at the company, uses an ‘agile’ methodology called ‘Scrum’ to develop software. One of the characteristics of the Scrum process is that clearly defined goals are completed in clearly defined time periods called ‘sprints’.
At the time our story starts, my squad had a problem: we were working hard, but failing to achieve our goals and thereby failing to deliver value at the end of each sprint.
This blog covers our efforts to troubleshoot our processes — and will hopefully help readers put their own measures in place to deliver on their goals in a more focused and effective manner.
We will look at our journey through the lens of the ‘burn down chart’.
For the uninitiated, a burn down chart is a tool often used by teams employing agile development methods. It visualises how much work remains over the course of a sprint. Burn down charts can be useful for predicting the likelihood of work being completed or, as in this blog post, for diagnosing problems with the execution of tasks and projects.
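To make the idea concrete, here is a minimal sketch of the data behind a burn down chart. The function names, the 20-point commitment and the 5-day sprint are illustrative assumptions, not our actual tooling or figures.

```python
from typing import List


def burn_down(total_points: float, completed_per_day: List[float]) -> List[float]:
    """Remaining work at sprint start and after each day of the sprint."""
    remaining = [total_points]
    for done in completed_per_day:
        # Remaining work never drops below zero.
        remaining.append(max(remaining[-1] - done, 0))
    return remaining


def ideal_line(total_points: float, days: int) -> List[float]:
    """Straight reference line from total_points down to zero."""
    return [total_points * (days - day) / days for day in range(days + 1)]


# Hypothetical 5-day sprint with 20 points committed (assumed numbers).
flatline = burn_down(20, [0, 0, 0, 0, 0])  # busy, but nothing completed
steady = burn_down(20, [4, 4, 4, 4, 4])    # iterative delivery
reference = ideal_line(20, 5)
```

Plotting `flatline` or `steady` against `reference` gives the two extremes this post moves between: a horizontal line, and a line that heads steadily down and to the right.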
In terms of burn down charts, this blog post explains how we went from working hard but achieving very little of value…
…to iterative delivery:
Before we get started, though, a disclaimer: I am not suggesting that the burn down chart is the be-all and end-all of squad efficiency - it is a metric, a useful focusing tool to explore how much a squad:
- understands its own work
- has focus
- can stretch its delivery
Now that’s understood — we can begin:
Part I — Flatlining
This was what our burn down chart looked like at the beginning of our sprint troubleshooting exercise. We were flatlining: as I wrote above, we were working hard, yet failing to complete any of the tasks that were supposed to comprise the sprint, and failing to deliver value at the end of the week. A pattern like this is not uncommon and usually means a team is having issues in correctly specifying work. But why were we having those issues?
One issue we identified straight away was this: our sprint planning sessions were taking too long. We tended to address both goal setting and task-estimating in a single session. As a result, planning sessions were mentally gruelling; after a while, planning fatigue would set in, we would run out of the mental energy required to discuss the tasks in front of us, and end up committing to work we didn’t fully understand. What we often ended up with was a series of vaguely defined and somewhat decontextualised tasks, and a sprint goal which felt unachievable.
Solution: separate the what from the how
The first thing we did was decouple the process of setting the sprint goal (the ‘what’) from the work of planning the sprint (the ‘how’).
We moved to a set-up where the product owner, squad lead and senior engineer in the squad would get together during the ‘current’ sprint and specify the sprint goal for the next sprint. Then we inserted a ‘spike’ task into our sprints; this spike was dedicated to creating the list of tasks (or ‘backlog’) we would need to work through to achieve the goal the PO, SL and senior engineer had chosen. This ensured that sprint planning took less time and the tasks created through the planning process were fully contextualised.
In Agile software development, a ‘spike’ is a special type of task included in a sprint for the purposes of investigating something — e.g. a new piece of technology, a thorny problem, a proposed approach. Spikes usually involve investing one person’s time up front in order to save the whole team time later on.
Part II — Hitting ‘the wall’
After the previous change in our ceremonies, our sprints started to look a bit healthier… in the first half of the sprint at least.
In agile development, ‘ceremonies’ is the name given to a group of important meetings that need to happen to facilitate work.
The above pattern usually appears when all the tasks in the sprint are blocked. When this happens, most teams start to pull tasks from their backlog in order to stay utilised, and morale takes a hit too.
One of the ways we reacted to ‘hitting the wall’ was to move to 2-week sprints. Perhaps, we thought, we just needed more time to climb up and over the wall in our path?
Increasing the iteration length is a common reaction to planning/efficiency failures in Scrum. While a longer sprint length is not necessarily bad in itself, it is important to recognise that widening sprints has the potential to hide problems (by removing tension from the system) rather than resolve them.
Solution: anticipate to iterate
With some of the tension removed from our situation, we were able to recognise that we had a problem with the interdependent nature of many of our tasks.
Instead of trying to remove all the dependencies between tasks (which would be ideal, but is seldom feasible) we approached the problem as a timing issue. Looking at it in this way, we decided that the logical thing to do would be to anticipate the blocking task(s), and address them in a separate sprint from the blocked task(s).
To achieve this we instituted ‘buffer sprints’ between our current sprint and the sprint we were ‘spiking’ (i.e. researching).
This enabled us to anticipate and address blockers in enough time, as follows: one of our engineers would look at the sprint goal for the ‘horizon sprint’ and create the backlog (i.e. the list of tasks that, when completed, would mean the goal had been achieved; the actual work to be done during the sprint). Crucially, if they discovered one task that was likely to block all the others, they would ‘pull’ that blocker task earlier so it could be addressed during the buffer sprint.
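The ‘pull the blocker earlier’ step can be sketched in a few lines. This is only an illustration under assumptions: the function name, the `blocked_by` mapping and the task names are hypothetical, and in practice the decision was made by an engineer reading the backlog, not by a script.

```python
from typing import Dict, List, Set, Tuple


def split_blockers(
    horizon_backlog: List[str],
    blocked_by: Dict[str, Set[str]],
) -> Tuple[List[str], List[str]]:
    """Return (buffer_sprint_tasks, horizon_sprint_tasks).

    Any task that blocks another task in the horizon backlog is pulled
    forward into the buffer sprint; everything else stays where it is.
    """
    blockers: Set[str] = set()
    for task in horizon_backlog:
        blockers |= blocked_by.get(task, set())
    buffer_tasks = [t for t in horizon_backlog if t in blockers]
    horizon_tasks = [t for t in horizon_backlog if t not in blockers]
    return buffer_tasks, horizon_tasks


# Hypothetical backlog: one task blocks all the others.
backlog = ["provision queue", "write producer", "write consumer", "add dashboard"]
blocked_by = {
    "write producer": {"provision queue"},
    "write consumer": {"provision queue"},
    "add dashboard": {"provision queue"},
}
buffer_tasks, horizon_tasks = split_blockers(backlog, blocked_by)
```

The sketch handles only one level of blocking. A chain of blockers (a task that blocks a blocker) would land in the same buffer sprint and would need either an extra buffer sprint or an explicit ordering.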
As a result we created a ‘planning horizon’ process — and a planning horizon meeting became one of our ceremonies:
The concept behind our process is called rolling wave planning, and is used in project management to do adaptive planning for project milestones.
Part III — Skylining
We now had a planning horizon — and our sprints were more productive — but we were still missing our goals:
The ‘skyline pattern’ in a burn down chart is tricky to read. It can be caused by a small scale ‘hitting the wall’ situation (see Part II, above). It can also be caused by a mismatch between a sprint goal’s acceptance criteria (AC) and the sprint backlog. Finally, it can be caused by a change to the sprint goal which has come about because of external circumstances. It took us a bit of time to understand what exactly was not working for us, and what we discovered was this: it was a combination of all three factors.
Solution: structure with science
We decided to try and mitigate this three-part problem by structuring our planning in a more scientific way.
We instituted a short weekly recurring meeting to discuss the next sprint goals and we made sure that part of this discussion covered the risks that could mean our sprint goals would be devalued. Having the risks explicitly stated allowed us to address them, and reduced the risk of us having to pivot mid-sprint due to external circumstances (the ‘shifting sands’ mentioned in the title of this post).
We made the creation of explicit ‘blocking vs blocked’ graphs a standard part of our sprint spiking process. Having a graphical canvas helps us discuss the blockers, and differentiate between “start blockers” (blocks the start of another task) and “finish blockers” (blocks the completion of another task).
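We drew these graphs by hand, but the distinction between the two kinds of blocker can also be written down as data. The sketch below is one possible encoding; the class names, helper functions and task names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Set


class BlockKind(Enum):
    START = "start"    # the blocked task cannot begin until the blocker is done
    FINISH = "finish"  # the blocked task can begin, but cannot be completed


@dataclass(frozen=True)
class Block:
    blocker: str
    blocked: str
    kind: BlockKind


def can_start(task: str, blocks: List[Block], done: Set[str]) -> bool:
    """A task can start once all of its start blockers are done."""
    return all(
        b.blocker in done
        for b in blocks
        if b.blocked == task and b.kind is BlockKind.START
    )


def can_finish(task: str, blocks: List[Block], done: Set[str]) -> bool:
    """A task can finish only once every blocker, of either kind, is done."""
    return all(b.blocker in done for b in blocks if b.blocked == task)


# Hypothetical example: the client needs the API contract before work can
# begin, but only needs the staging environment in order to be completed.
blocks = [
    Block("agree API contract", "build client", BlockKind.START),
    Block("set up staging", "build client", BlockKind.FINISH),
]
done = {"agree API contract"}
startable = can_start("build client", blocks, done)
finishable = can_finish("build client", blocks, done)
```

In this example the client work can begin straight away but cannot be closed until staging exists, which is the kind of task that produces a small ‘wall’ late in a sprint.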
In addition to this we modified the way we specified tasks. It became standard procedure, when creating a task, to explicitly express the steps needed to complete it. This is a great tool for reasoning on blockers. What we found is that — more often than not — you can identify the specific step that is blocked and break it out as a separate task, resulting in more independent tasks. Using these methods we are able to avoid small scale ‘wall’ patterns.
Finally, we now explicitly validate the sprint backlog against the sprint goal’s acceptance criteria during our planning session. This simple step avoids the necessity of pulling in additional work during the sprint just to satisfy the sprint goal itself.
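The validation itself is a conversation in the planning session, but its logic amounts to a coverage check. Here is a minimal sketch, assuming each backlog task is tagged with the acceptance criteria it contributes to; the function name, the tagging scheme and the example data are assumptions made for illustration.

```python
from typing import Dict, List, Set


def uncovered_criteria(
    acceptance_criteria: List[str],
    backlog: Dict[str, Set[str]],
) -> List[str]:
    """Return the acceptance criteria that no backlog task addresses.

    `backlog` maps each task name to the set of criteria it contributes to.
    """
    covered: Set[str] = set().union(*backlog.values())
    return [ac for ac in acceptance_criteria if ac not in covered]


# Hypothetical sprint goal with three acceptance criteria.
criteria = ["AC1: events are published", "AC2: events are consumed", "AC3: alerts fire"]
backlog = {
    "write producer": {"AC1: events are published"},
    "write consumer": {"AC2: events are consumed"},
}
missing = uncovered_criteria(criteria, backlog)
```

A non-empty result at planning time is a prompt to add a task or to renegotiate the goal, before the gap shows up mid-sprint.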
Part IV — Down and to the right
After this journey, here’s where we are today:
We consistently meet sprint goals. We get there iteratively, not by delivering everything on the last day of the sprint. Our delivery is much more predictable, and our sprint demos are no longer stressful.
We are back to 1-week sprints. Shorter iterations make this process more effective by limiting the scope of sprint goals and the distance between the current sprint and horizon sprints.
Part V — Timelines
It’s time for full disclosure: getting to this point took us some time - a lot of time, in all honesty. Introspection skills are not the easiest thing in the world to master. Nor is being open-minded and accepting of constant change.
The timeline for moving sprint completion from 0% to 100% was in fact 29 weeks and 4 days. However, that doesn't matter, because the journey in terms of reflecting, changing, learning, adapting and iterating is our real success story. We are now a significantly stronger, more accountable and higher-performing team than we ever were before. How many teams do you think could undergo such a positive transformation in only 29 weeks and 4 days?
Part VI — What’s next for us?
While thinking in terms of a planning horizon has proved effective, one thing we are considering is how we can make the whole process less time-hungry. As of now, the cost is between 1 and 2 developer days per sprint. In our squad this amounts to roughly 5–10% of capacity. We think it pays back, but it would be interesting to see if we can get the cost down.
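For readers who want to run the same calculation for their own team, here is the arithmetic as a small sketch. The squad size of 4 developers and the 5 working days per sprint are assumptions chosen because they reproduce the 5–10% range above; they are not figures stated in this post.

```python
def planning_cost_share(planning_days: float, developers: int, sprint_days: int) -> float:
    """Fraction of sprint capacity spent on horizon planning."""
    capacity_days = developers * sprint_days
    return planning_days / capacity_days


# Assumed squad: 4 developers on a 5-day sprint, so 20 developer days.
low = planning_cost_share(1, developers=4, sprint_days=5)
high = planning_cost_share(2, developers=4, sprint_days=5)
print(f"{low:.0%} to {high:.0%} of capacity")
```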
Another thing we are investigating is our upper limit of sustainable work. We plan on turning up the commitment dial gradually, looking for the sweet spot where we can comfortably do the work and have time to learn and still be challenged.
Finally, we have just started adding annotations to our burn down charts to track when we are meeting the sprint goal acceptance criteria. We believe that this can give us some additional insights on how effective our delivery is.
This process is the brainchild of my whole squad and various other fantastic people at Skyscanner. Thank you all!
If you’ve got any comments or feedback for me please dive into the comments below — I’m always interested to hear about the efficiency challenges other teams have had and always willing to share ideas 👍
About the author: Cristiano Balducci
Cristiano is passionate about lean, agile, DevOps and solving efficiency problems (for both teams and software). He has worked mainly in operations automation in the past and is currently squad lead at Skyscanner. Cristiano is a compulsive reader, almost retired boxer (one is never completely retired), and all-round joker.