Pre-Mortem: Working Backwards in Software Design
What is a pre-mortem?
A pre-mortem is a strategy in which a team imagines that a project has failed, and then works backward to determine what could have led to that failure. The term was coined by cognitive researcher Gary Klein.
Unlike a post-mortem or root-cause analysis, which is performed after things have failed, a pre-mortem is done before the project starts. It uses “prospective hindsight” to help the team make better decisions by working backward and reducing thinking biases.
This post is about a customized version of the pre-mortem that PayPal’s engineering team adopted last year. This strategy greatly benefited our team, and we are excited to share our story with the greater technology community.
A twist in technical design review sessions: poke my design, a.k.a. pre-mortem
Once a technical design is documented, the standard next step is to have key stakeholders review the design.
The pre-mortem strategy asks us to flip that script and ask: what if the proposed design implementation failed? The next step is to brainstorm with your team on the possible reasons the technical design could fail. The main point to keep in mind is to get creative and come up with as many ideas for failure as possible. Finding faults for the greater good is a liberating exercise, especially when initiated at the beginning of a project. The aim is to bubble up the most robust design alternatives. It is key not to sweat the solutions yet, just the problems. Brainstorming solutions can come later as a team.
When my team completed this simple exercise, it brought diverse perspectives to light. It was a joyful game of throwing darts at the initial design to kill it. What makes this exercise powerful is that the team still has control to fix the problems. After seeing the benefits of this process, our team started running this exercise regularly before we started coding.
Here’s what our technical pre-mortem looks like
As a platform team, we work on critical backend risk decision services and often collaborate with other teams. Here is a simple three-step pre-mortem framework we refer to when working on sprint stories:
1. An engineer writes up a one-pager for the user story. It articulates what problem we are solving for our end user, why we need to solve it now, and how we’re going to implement the solution (flow/sequence diagram) and test it. The testing strategy is a good indicator of how well we understand the problem or solution. Can we identify the impact of our changes?
2. Next, conduct a pre-mortem within your team to discuss all the ways this design could fail, in order to catch gotchas. For example: is this scalable? Is the data available to make the decision, or do we need an API call? Will the additional call meet service level agreement (SLA) or latency requirements?
3. Address issues and refine your design.
Important: Ensure all key design decisions and tradeoffs are recorded in this distributable online one-pager. It becomes the source of truth for development. This avoids losing important knowledge in multiple email chains.
Also, if your user story spans work across multiple teams, conduct a pre-mortem with architects, and partner with the teams’ tech leads too. This may reveal whether the design is misaligned with the long-term architecture, whether there is duplicate effort, or whether there is a better way to do this. These are costly mistakes if not found early.
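The latency question from step 2 can often be answered with quick arithmetic during the pre-mortem itself. Below is a minimal sketch of such a check. The function names and all the numbers are hypothetical, chosen only for illustration, and simply adding the two latencies is a rough first-pass estimate for a sequential call (real tail latencies do not combine this neatly).

```python
# Hypothetical latency-budget check for a pre-mortem discussion.
# Function names and all numbers here are illustrative assumptions,
# not taken from any real service.

def remaining_budget_ms(sla_ms, current_latency_ms, extra_call_latency_ms):
    """Headroom left under the SLA if the extra call runs sequentially."""
    return sla_ms - (current_latency_ms + extra_call_latency_ms)


def fits_sla(sla_ms, current_latency_ms, extra_call_latency_ms, safety_margin_ms=0):
    """True if the estimated total latency leaves at least the safety margin."""
    headroom = remaining_budget_ms(sla_ms, current_latency_ms, extra_call_latency_ms)
    return headroom >= safety_margin_ms


if __name__ == "__main__":
    # Example: 200 ms SLA, service currently at 120 ms, new API call adds 50 ms.
    print(remaining_budget_ms(200, 120, 50))
    print(fits_sla(200, 120, 50, safety_margin_ms=20))
```

If the check fails, that is a finding for the one-pager: the team can then discuss alternatives such as caching the data or making the call asynchronously, once the problem-finding phase is over.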
Benefits of the pre-mortem approach
The biggest benefit is the ability to catch issues early on and expose our blind spots, but there’s more to it:
1. Pre-mortem encourages everyone on the team to see the big picture and then work backwards to create the best solution. For platform teams, this is not very straightforward, as our code sits deep inside the stack and does not directly interact with our customers the way a front-end application does.
2. It breaks down silos and relies on the team’s collective intelligence and imagination. If the problem or solution is unclear, that becomes evident right at the start, which reduces churn later.
3. Creates an environment of psychological safety when we normalize talking about failures. There are NO stupid questions.
4. It’s a great way for tech leads and managers to mentor and coach new and less experienced members of the team. The pre-mortem sessions provide opportunities to learn how our services interact with the rest of the PayPal ecosystem, and to understand the long-term architecture strategy and non-functional requirements.
5. Increases team engagement and participation by avoiding a top-down approach where an engineer passively works on a design handed to them.
6. The one-pager is a source of truth that can be referenced for code reviews, functional tests, and end-to-end or integration testing.
What to watch out for
1. If we catch ourselves going into “analysis paralysis” with more iterations of the pre-mortem, then there may be something unclear about the user story or its scope. It’s a good time to circle back with stakeholders to clarify scope and boundaries.
2. Actively solicit diverse opinions right from the start of the pre-mortem discussion. This is true for any meeting. Here’s Harvard Business School professor Frances Frei’s suggestion, “Diverge before you converge,” in an excerpt on how to run meetings:
One way to make conversations both more productive and shorter is to abide by the principle, “Diverge before you converge.”
“Whenever one person speaks in a meeting and they’re going to give a point of view, if left unaided, the next person is likely to give a similar point of view, and the next one a similar point of view, until you fill the allotted time,” Frei says. Not only does this waste time, it can pave the way to bad decisions because people don’t feel comfortable breaking with the group.
Instead, she recommends that meeting facilitators step in after the first person gives their perspective. Then ask, “Can someone articulate a different point of view?” After the next person speaks, see if anyone is able to offer yet another distinct point of view. This way, Frei says, “meetings will go faster and be of higher quality.”
Pre-mortem makes working backwards a happier dance :)
Try using the pre-mortem technique in your teams. We may not be able to eliminate all mistakes, but pre-mortem normalizes talking about potential failure scenarios and encourages us to be curious and ask questions such as “What can we do better this time?”
Below are references to read more about pre-mortem. Feel free to share feedback in the comments.
Gary Klein on performing a project pre-mortem: https://hbr.org/2007/09/performing-a-project-premortem
Here’s an article about misuse of pre-mortem: https://capitalallocators.com/wp-content/uploads/Klein-Sonkin-and-Johnson-2019-The-Misuse-of-Premortems-on-Wall-Street.pdf
Nobel laureate Daniel Kahneman on pre-mortem: https://fs.blog/2014/01/kahneman-better-decisions/
McKinsey report: https://www.mckinsey.com/business-functions/strategy-and-corporate-finance/our-insights/bias-busters-premortems-being-smart-at-the-start