I deleted a production database

And learned a few things along the way

Adrian Hornsby
The Cloud Architect

A stormy day at work

It happened more than a decade ago. While I might have forgotten some technical details, I remember how I felt as if it were yesterday.

“I am done. It is the end, and I will be fired now.”

These were the words I kept repeating to myself.

I had just deleted a production database critical to our customers.

It was late evening. I had been debugging an issue with one of our services and was ready to call it a day and go home.

All of a sudden, monitoring alarms started to go off. Requests were timing out at an unprecedented rate.

Our customer, already on the phone with the leadership team, could have been happier.

We had told them that increasing the load on our service by a factor of ten overnight wouldn’t be an issue.

And why would it be an issue? Everything worked perfectly in the test environment.

Yet, there I was, staring at the logs.

HTTP status 504: Gateway timeout server response timeout

HTTP Status 504, or Gateway Timeout error, typically means that the web server didn’t receive a timely response from an…

Adrian Hornsby
The Cloud Architect

Principal System Dev Engineer @ AWS ☁️ I break stuff .. mostly. Opinions here are my own.