Defect management in Agile: try the andon cord method from lean manufacturing

There are lots of tools that claim to help manage defects in your software. We’ve tried a few of them (JIRA, Pivotal Tracker), and I’ve personally been involved in projects where a third party has been in charge of testing and QA. I don’t like either approach one bit.

Defect management can become a complex business if you let it, but I believe it should be pretty simple.

Several years ago we became frustrated by our failure to control the quality of what we delivered, and by the corrosive impact this was having on relationships with our clients and customers. Retrofitting quality is hard, frustrating and expensive (AKA: the horse has bolted), so we’ve found it’s best to have a method that stops the situation arising in the first place.

When we were faced with building a real-time trading platform, we learned very quickly that we needed a different way of handling defects, one that resulted in far fewer being released into the wild. It needed to be pretty much a zero-bug deployment, but we wanted to keep the ability to deliver improvements and features quickly and iteratively. We didn’t want to fall into the trap where fear overcame ambition and we placed walls between production and deployment.

The prospect of creating the most complex and risky application we’d ever built, releasing it to several big banks and then iteratively deploying new features helped to focus our minds: it was this do-or-die situation that motivated us to find a method that worked.

Defects belong to stories

In my experience, as soon as you decouple defects from a user story you’re doomed. I don’t care how you manage it, what company you use for testing and QA, or what bit of defect management software you use. You’ve failed because the relationship between the defect and the story has been removed. From then on you’ll be fighting for resources to get the defect fixed, and your developers will react badly to yet another long list of bugs to fix.

The first line of defence is attack

Defects in production reduce confidence in your product or service, cost more to put right and can be demoralising for everyone involved. The easiest way of reducing the pain is therefore to ensure that fewer defects ever make it into production. If you follow a user-story-based method where your stories flow left to right (such as in Kanban), there’s little reason why you shouldn’t be able to minimise defects given the right approach.

The andon cord

Toyota developed the andon cord method in lean manufacturing: if a defect is detected on the production line, an employee can pull a cord or push a button to stop production so the defect can be addressed immediately. The aim was to give the worker the ability, and more importantly the authority, to stop production when a defect is found and immediately call for assistance.

Applying the andon cord to Agile

The principle is quite simple: if an issue occurs during testing, the tester is empowered to take the defect straight to the developer, and the defect becomes the focus of attention until it’s rectified. As Ian Carroll succinctly puts it:

“The Kanban approach is:

1. Chuck a single story into test
2. Tester finds defects, grabs developer by scruff of neck and says “WTF!”
3. Developer immediately stops what they’re doing and works with the tester to resolve the issues. No defects are logged. No red cards are created
4. Automated tests are updated
5. All focus is on getting the story into an acceptable state”

In our case, we make use of a series of Trello stickers to indicate defects so it’s blindingly obvious when a defect has occurred. The card also gets pulled into Up Next from Testing, the defect is recorded on the story card and the tester (or whoever found the defect) grabs the developer to explain.
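
For anyone who thinks better in code, here is a minimal sketch of that flow. It is not how Trello works under the hood and not a tool we actually run; the class, the function names and the "In Progress" and "Done" columns are all made up for illustration. The only things taken from our board are the "Up Next" and "Testing" columns, the sticker (a simple flag here) and the rule that the defect is recorded on the story card itself.

```python
from dataclasses import dataclass, field

# "Up Next" and "Testing" come from the post; the other columns are assumed.
COLUMNS = ["Up Next", "In Progress", "Testing", "Done"]


@dataclass
class StoryCard:
    title: str
    column: str = "Up Next"
    # Defects live on the story itself, never in a separate tracker.
    open_defects: list[str] = field(default_factory=list)
    flagged: bool = False  # stands in for the Trello sticker


def pull_cord(card: StoryCard, description: str) -> None:
    """Tester found a defect: record it on the story and pull the card back."""
    if card.column != "Testing":
        raise ValueError("the cord is pulled on cards that are in Testing")
    card.open_defects.append(description)
    card.flagged = True
    card.column = "Up Next"


def resolve_defects(card: StoryCard) -> None:
    """Developer and tester have fixed the issues: the story returns to Testing."""
    card.open_defects.clear()
    card.flagged = False
    card.column = "Testing"


# Hypothetical story and defect, purely for illustration.
card = StoryCard("Place a limit order", column="Testing")
pull_cord(card, "Order total rounds incorrectly")
print(card.column, card.flagged, card.open_defects)
```

Notice that there is no separate defect object with its own lifecycle. The defect only exists as part of the story, which is the whole point.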

Not all defects are equal

Once defects have made it into the wild (and it does happen, no point pretending otherwise) we treat them like any other new requirement. They start as a card in the backlog, are refined as necessary and prioritised by the product owner, and make it through to our sprintboard, where they’re worked on like any other user story, going through the same process, testing and QA.

The important thing about this method is that it enables product owners to understand that not all defects are born equal. For instance, some defects have a negligible impact on operation or user experience and may well be judged a lower priority than shipping a new feature. If you just create a defect on the sprintboard, by contrast, it’s automatically deemed to be higher priority than anything on the backlog. And actually, it might be that more value can be generated with the same effort by developing a new feature rather than fixing an insignificant defect.
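
To make the idea concrete, here is a rough sketch of a backlog where production defects and features sit side by side and are ordered by the same yardstick. The value-per-effort score, the numeric scales and the example cards are assumptions for illustration only; in practice the product owner prioritises by judgement, not by formula.

```python
from dataclasses import dataclass


@dataclass
class BacklogCard:
    title: str
    kind: str    # "feature" or "defect"
    value: int   # product owner's estimate of value (hypothetical scale)
    effort: int  # estimate of effort, must be greater than zero (hypothetical scale)


def prioritise(backlog: list[BacklogCard]) -> list[BacklogCard]:
    """Order cards by value per unit of effort, ignoring whether they are defects."""
    return sorted(backlog, key=lambda c: c.value / c.effort, reverse=True)


# Invented cards: one trivial defect, one feature, one serious defect.
backlog = [
    BacklogCard("Typo in footer", "defect", value=1, effort=1),
    BacklogCard("Export trades to CSV", "feature", value=8, effort=2),
    BacklogCard("Prices shown in wrong currency", "defect", value=13, effort=2),
]

for card in prioritise(backlog):
    print(card.kind, "-", card.title)
```

With these made-up numbers the serious defect comes first, the feature second and the trivial defect last. Being a defect does not move a card up the list on its own.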

Whatever you do, hang your dirty linen out for all to see: bugs and defects do not belong in secret places or in one person’s head. They are forms of debt that may or may not need repaying in the end.