Jean-Michel has broken the prod… again!

Or how to play down failed production releases

--

The way we handle releases evolves a lot over the course of our careers. When we start out as developers, and for some of us even long after, pushing the big red button can be a little scary.

What if something goes wrong?
What if our code is not that perfect?
What if the whole site goes down because of us?! 😱

You have probably asked yourself these questions at least once, and maybe it’s still the case today. But… Why?

Excerpt from the introduction scene of the game “Sonic Advance 3”, in which the character Eggman presses a big red button.
Why not have Eggman’s ease when it comes to pressing the big red button? (introduction scene from “Sonic Advance 3”, Game Boy Advance)

The first reason is quite obvious: professional conscience. We want to do our job properly, and breaking production is the antithesis of this goal.

That said, this fear of releases seems to be more common among junior developers. Does that mean seniors are less conscientious?
Of course not, but it would be a bit naive to say that seniors are simply more sure of themselves: more experience in the industry means more confidence in their skills, and therefore less fear of pushing to production. There is undoubtedly some truth in that, but it’s also way more complicated.

In addition to professional conscience, two other things can be scary when we break production:

  1. how our colleagues will look at us
  2. the repair process in case of emergency

Let’s take a few seconds to think about it… These two points should never be something to be afraid of, right?

There are so many things we can legitimately be afraid of… (from the first movie trailer of “Sonic the Hedgehog”)

Do you trust your colleagues?

When the code we wrote breaks something, we can’t help but blame ourselves. This is our code, so it’s our fault. Right?

Wrong!

If your company follows industry standards, you are probably not the only contributor to the code you push into production, even if you wrote it.

Perhaps this code was the subject of a technical analysis. Perhaps you wrote it with a colleague during a pair programming session.
You most probably opened a pull request, where your code was reviewed. Quality engineers tried to break everything while testing on staging environments, and automated tests checked a lot of things on their side.

In short, the point here is that you are not the only one responsible for the code you deploy to production: this is a team matter!

When a bug is found in production, it’s never one person’s fault: the whole team has failed somewhere.

It’s true that developers like to tease each other. We love to say that “Jean-Michel has broken production again”. Making fun of Jean-Michel can play down the situation while reassuring ourselves. But beware.
Maybe Jean-Michel is OK with that and happily laughs with you, but it can create an unhealthy atmosphere in your team without you meaning to. Developers unsure of their skills might be scared of finding themselves in Jean-Michel’s shoes.

Blaming each other does not effectively fix bugs. Work as a team! (from the movie “Sonic the Hedgehog 2”)

While joking about the subject is a good way to make people understand that no one dies because of a bug, we are not all sensitive in the same way. Just remember that.

The important thing is to set up a climate of trust in your team. Everybody should feel comfortable with making mistakes.

As we say at OpenClassrooms: “It’s OK to fail”!

But keep in mind that breaking production is not trivial: it is not OK for your site to be unavailable for too long. It can have a big impact on the credibility and reputation of your product. For everyone to be comfortable with making mistakes, fixes must be easy and quick to apply.

Which brings us to the second point: your emergency repair process.

Do you trust your rollback process?

When a release breaks something, it’s important to be able to react quickly and efficiently. What to do must be clear to everyone: notify the team, launch the rollback deployment, fix the faulty code… Whatever your process is, it will be followed in a hurry: that’s why it needs good documentation. Everybody must be comfortable with it!

If your team knows precisely what to do in case of production issues, the fear of breaking something will naturally decrease. Fear of the unknown is one of the most common fears, so make sure your rollback process is not scary!

Playing down failed production releases is possible if, and only if, the rollback process is clear, efficient and not stressful.

At OpenClassrooms, we trust our process so much we don’t hesitate to deploy on Friday… What about you?

Allegory of a rollback going well (reversed GIF from the game “Sonic the Hedgehog”, Mega Drive)

Well, OK, you broke the prod… So what?

Just as a baker can burn a loaf of bread, or a painter can botch a brushstroke, a developer can write broken code.

It happens, it’s part of the job!

Come on, don’t blame yourself like that… (“Sonic Colours”, Wii)

For the most junior among us, please be aware that we have all made mistakes. Whether it’s me, dear Jean-Michel, that colleague who impresses you, or even your CTO… We’ve all broken an application in production at least once.

In my very first job, I sent a test email to all the clients of the company: using the production database instead of the staging one is a classic mistake, and I made it only once in my life, trust me!

Another example, this time at OpenClassrooms: my own manager once broke the public API route serving our learning paths. And guess what? He’s now the engineering director of the company!

Just because your code broke something doesn’t mean your career is over.

But please don’t twist my words: breaking production is something to avoid. Making a mistake once in a while isn’t that bad, but repeating the same mistake is a problem.
Please remember the following:

The good thing about making mistakes is that you learn from them.

So learn, learn while you can!

And be nice to Jean-Michel 😉

And now… go push this big red button! (from the movie “Sonic the Hedgehog”)

--

Adrien Guéret
Product, Experience & Technology @ OpenClassrooms

Front-End developer, working at OpenClassrooms. Also Nintendo enthusiast :)