Incident Postmortem Template

Towards Operational Excellence

Adrian Hornsby
The Cloud Architect

--

I’d like to express my gratitude to my colleague and friend Arni Birgisson for his valuable feedback.

Since I published my blog series Towards Operational Excellence, I have received a lot of feedback. One question in particular stood out.

“Can you share an incident postmortem template?”

In this blog post, I will share an example incident postmortem template, which I hope will help you get started. I will also share some DOs and DON’Ts that I have seen work across a wide variety of customers, both inside Amazon and externally.

What is a postmortem?

A postmortem is a process where a team reflects on a problem — for example, an unexpected loss of redundancy, or perhaps a failed software deployment — and documents what the problem was and how to avoid it in the future.

“Postmortems are not about figuring out who to blame for an incident that happened. They are about figuring out, through data and analysis, what happened, why it happened, and how it can be stopped from happening again.” —…


Principal System Dev Engineer @ AWS ☁️ I break stuff .. mostly. Opinions here are my own.