You made a mistake. Now what?

Juan José Ramírez Calderón
Published in Tekton Labs
6 min read · Jan 17, 2019

I accidentally dropped and recreated an entire database schema. That means every row in every table of that schema was lost.

Wait, what? How could you? Were you fired? Are you in jail? Did you forget the WHERE in the DELETE FROM?

This is no horror fiction but a true story. I made a mistake and I can’t change the past. But perhaps this could be useful for you. My dad always told me:

A wise man learns from his mistakes, but a wiser man learns from the mistakes of others.

About the project

The project where I made the mistake was complex and interesting — it had its own database, but also interacted with remote client databases. Tekton Labs offers a team augmentation service, so a co-worker and I were working remotely with a full team of developers.

The task at hand

Some client databases are missing a column in table X, in schema Y. In that case, alter table X to have the required column. There are other clients whose database is missing the whole schema Y. Create schema Y with all its tables with our SQL script.

They provided me with several connection strings of different database servers. Some hosted more than one client database. In total, I had to check and make proper changes to a list of 15 client databases.

What I did

With the credentials they gave me, I connected to all DB servers. Then, one by one, I opened the right database and schema and figured out what task should be performed in each database: run creation script or run modification script.
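The per-database decision described above could also be made by a small script instead of by eye. The following is only a minimal sketch, not what I actually ran: it uses SQLite through Python's standard `sqlite3` module as a stand-in for the real database servers, and the names `plan_action`, `x` and `required_col` are hypothetical.

```python
import sqlite3


def plan_action(conn, table, column):
    """Return 'create', 'alter' or 'none' for one database connection.

    Assumption: SQLite stands in for the real servers, so the
    "schema Y is missing" case is approximated by "table X is missing".
    """
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if exists is None:
        return "create"  # nothing there yet: the creation script applies

    # PRAGMA table_info returns one row per column; index 1 is the column name.
    quoted = '"' + table.replace('"', '""') + '"'
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")}
    return "none" if column in columns else "alter"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(plan_action(conn, "x", "required_col"))  # table missing
    conn.execute("CREATE TABLE x (id INTEGER)")
    print(plan_action(conn, "x", "required_col"))  # column missing
```

Writing the plan down for all 15 databases before running anything separates the "decide" step from the "execute" step, which is where the flow state did its damage.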

I had both scripts in my query editor page and ran one at a time. First database, done. Second DB, done. And so on. As I finished each database, my confidence increased. As my confidence increased, so did my speed. After a while, I was performing the tasks like a machine — totally immersed in the flow.

Immersed in the flow

I made a mistake

Guess what: despite feeling like a well-oiled machine, I’m still human. On the 10th database, I mis-clicked the dropdown where you select the database. I was so immersed in the flow that I ran the 10th database’s script against the 9th database instead. The 9th required only the modification script, while the 10th needed the creation script. When I ran it, the console output read ‘TABLE DROPPED’ multiple times.

I had barely read the creation script. I opened it again and, as I feared, before creating the schema and tables it first dropped any existing tables in order to start fresh. Oh no. I had deleted the data from all the tables in the schema. I took a deep breath and thought: I made a mistake. Now what?
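Reading the script first would have shown me the danger. As a rough illustration, here is a sketch of a check that lists the destructive statements in a SQL script before it is run. The function names and the sample script are hypothetical (the real creation script is not reproduced here), and splitting on semicolons is a deliberate simplification that would misfire on semicolons inside string literals or comments.

```python
import re


def is_destructive(statement):
    """True for DROP/TRUNCATE, and for DELETE without a WHERE clause."""
    if re.match(r"(DROP|TRUNCATE)\b", statement, re.IGNORECASE):
        return True
    if re.match(r"DELETE\b", statement, re.IGNORECASE):
        return re.search(r"\bWHERE\b", statement, re.IGNORECASE) is None
    return False


def destructive_statements(script):
    """Return the statements in `script` that would destroy data."""
    # Naive split: good enough for a sketch, not for arbitrary SQL.
    statements = [s.strip() for s in script.split(";") if s.strip()]
    return [s for s in statements if is_destructive(s)]


if __name__ == "__main__":
    # Hypothetical script, shaped like a "fresh start" creation script.
    sample = """
    DROP TABLE IF EXISTS y_x;
    CREATE TABLE y_x (id INTEGER);
    """
    for statement in destructive_statements(sample):
        print("WARNING, destructive:", statement)
```

A warning like this does not prevent a mis-click, but it makes it much harder to run a script without knowing that it drops tables.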

OMG! I deleted a database

Now what?

I talked to the other dev assigned to the project. He was supportive and we both searched for solutions. How to restore from a backup? Which client depended on this specific database? Can we restore the data?

It seemed the database didn’t have any kind of backup, and we were unable to find out which client owned it. This gave us hope: it might have been a dummy database. Also, these databases only stored processed information from other systems, not transactional data. That meant that, in the worst-case scenario, we could run new extractions and fill the client database again. However, that would cost us valuable time.

We needed to inform both the client and the team leader about the situation. Why? Because if data was missing, some other app that depended on it would crash or at least show empty reports. If that happened, the final client would contact our client to find out what was wrong. It would be far better for our client to hear about the problem from us first.

We informed our client via Slack, but he was offline. Then we informed our team leader. He listened carefully and focused on what to do next to mitigate the impact. He made sure everyone who might be contacted by the client was aware of the situation.

When the client came online we found out it was only a staging database. By pure luck, I had managed to delete the only DB that did not have production data.

That was close.

Sh*t happens

People make mistakes. Your coworkers have made mistakes. You’ve made mistakes. You will again in the future. Mistakes don’t only happen in software development; they happen in every context. We can learn from how other professions deal with human error.

For example, there’s a safety method developed in Japan called Pointing-and-Calling that reduces errors by up to 85 percent and cuts accidents by 30 percent. It only requires you to point at, and say out loud, information related to the task you are performing.

There’s another procedure implemented in nine hospitals in Michigan that within 18 months had saved 75 million dollars in healthcare expenses related to human errors. That awesome procedure is, believe it or not, just a checklist.

I had read about these methods in the past, but I hadn’t applied that knowledge in my own profession. Knowledge is not power; implementation of knowledge is. Even after learning that the tables I had dropped weren’t that important, I still had to finish the task, working with many other client databases.

Now, very consciously, I sketched in my notebook everything I had to check on each DB and the steps to perform next, in checklist fashion. After that, I asked my fellow developer to supervise me while I finished the task. I went step by step through the checklist, pointing at what I saw and calling it out loud, along with the related action.

In that way, the task was finished with no more mistakes, and we both developed a simple procedure to implement while working with production data.
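The notebook checklist and the pointing-and-calling can also be partly built into tooling. The sketch below is not the procedure we wrote down, only one possible shape for it: the checklist items are my paraphrase, and `run_checklist` and its parameters are hypothetical. The operator has to type the name of the database they intend to change, and the run stops if that does not match the database the connection is actually pointing at.

```python
CHECKLIST = [
    "Connection points at the intended server",
    "Selected database matches the entry on the task list",
    "Chosen script (creation or modification) is what this database needs",
    "Script has been read and contains no unexpected DROP statements",
]


def run_checklist(connected_db, steps=CHECKLIST, ask=input):
    """Return True only if the target is confirmed and every step is ticked.

    `connected_db` is the name reported by the live connection;
    `ask` is injectable so the function can be tested without a keyboard.
    """
    # "Calling": the operator states the target instead of trusting a dropdown.
    intended = ask("Which database do you intend to change? ").strip()
    if intended != connected_db:
        print(f"STOP: connected to {connected_db!r}, not {intended!r}")
        return False

    # "Pointing": each item must be explicitly acknowledged.
    for step in steps:
        if ask(f"{step}? [y/N] ").strip().lower() != "y":
            print(f"STOP: unchecked step: {step}")
            return False
    return True
```

With a guard like this, my mis-click would have stopped at the first question: I would have typed the 10th database's name while connected to the 9th.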

Aftermath

A week later, I was still wondering what would have happened if it had indeed been a client database with real transactional data. In that case, we would have had a serious issue.

We need to learn from our mistakes and grow. I asked the client whether they performed periodic backups and suggested adding them to their process if they didn’t. It turned out they did, and he thanked me for the suggestion.

At our weekly meeting, I told the team about my experience, hoping they would learn from my mistake and prevent future errors.

TL;DR

If you have made a mistake

  • Accept it, embrace your responsibility and mitigate the damage
  • Implement changes so it’s harder to make the same mistake again and easier to make better choices
  • Learn from it and grow
  • Tell others about your experience to help them avoid the same mistake

Many thanks to Shannon Towle and Jordano Moscoso for proofreading and providing feedback.

Powered by Tekton Labs
