Blamelessness

Jan Filipowski
Jan Filipowski blog
Mar 22, 2017

This year’s wroc_love.rb conference featured a really interesting disaster recovery story by Sebastian. It was entertaining, but also thoughtful (see the slides here). Go ahead and read the slides to get into the right mindset for the rest of this text.

One of the most important aspects of a disaster recovery postmortem is blamelessness. You should think about all the causes of the incident, but always take care not to blame anyone. First, because your very subjective point of view may not be the whole reality, only a small chunk of it. It also encourages openness: you don’t want to hide any flaw of your system, otherwise you’ll end up with a system full of defects that no one wants to fix, because no one wants to talk about them in the first place. Of course there’s also the emotional part of this practice: it’s easier to calm down after a disaster and focus on the merits instead of covering your ass.

But think for a moment about how you feel after causing such a disaster. How did `team-member-1` from GitLab feel? How did the anonymous DevOps engineer from the S3 team feel? Maybe they’re robotic professionals, but I’m almost 99% sure that they felt bad. Maybe I have a lot to learn, but after even a small disaster I always feel like an idiot. I hate the very situation, and I feel that I’m the one to blame: my fingers started the disaster. The disaster may have its root cause in the quality of the system or process, and its impact may be multiplied or divided by some qualities of the system, but still I was the one who pushed the button, hit enter, whatever. Even with the best intentions I increased the entropy and turned a stable system into a piece of crap (even if only temporarily).

My question for today is: how do you act professionally in such situations? I feel it’s perfectly OK to feel responsible for a problem introduced by my action, even if I didn’t have enough knowledge or the process was flawed. It’s part of ownership, which I also find very valuable. But anger, sadness or any other negative emotion shouldn’t lead me to short-term-only solutions. I’m to blame not only because of this very action, but also because I hadn’t yet fixed some part of the system or process.
