How Flyway ruined us

Kevin Malone, The Office

Once upon a time, there was a company that must not be named, which needed to execute SQL scripts for data manipulation and data definition. Not the first one with that need, nor the last one. They had a few databases for test environments and one for production. In their development cycle, they would manually run these scripts against the environments where they were shipping the code. For production, they would gather them all together and let a system administrator run them.

This worked for a while. However, people still forgot scripts here and there, or submitted an outdated script. As the team grew, more scripts were produced, and the time for automating their execution finally came. They came up with a do/undo structure: all devs committed to a single SQL file (did someone say conflicts?) that would run with every deployment. The thinking was that this would keep the schema current, valid and working throughout time. Also, nothing would be left behind this way.

There would be a system that executed the do file and, if that failed, the undo file. And if that failed too… they would need to call a developer, with DBA assistance, in the middle of the night to fix it, because it was very likely the changes had been partially applied.

I hope the story doesn't sound familiar. Yet.

A deployment would break if just one of the scripts had a syntax error. It was all or nothing. Devs couldn't resolve SVN conflicts and would step on each other's scripts. Scripts were still lost. The problem escalated. People were frustrated. It was the wrong approach, they thought.

So, this one dev thought of setting up Flyway. It was modern, had versioning, supported SQL. After all, Dave Syer recommended it.

Database migrations are something that Java developers struggle with, and Flyway provides a nice tool that anyone with basic knowledge of SQL can use. For that reason it has become the favourite migration tool in the Spring Boot team — Dave Syer

The team had little knowledge of it, but that wouldn't matter since the documentation was available. Internal documentation was released as well. A job was written to handle versioning, to make developers' lives easier, and it would be transparent to them, helping them avoid mistakes. They tested it, it worked. One file at a time, no more conflicts. If something broke, it shouldn't block anything else.
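For readers who haven't used it: a Flyway versioned migration is just a plain SQL file whose name encodes its version and description. The sketch below uses invented table and column names purely for illustration:

```sql
-- V3__add_customer_email.sql
--
-- Flyway picks this file up by its name (V<version>__<description>.sql),
-- runs it exactly once, in version order, and records the outcome in its
-- schema history table. The table and column here are made up.
ALTER TABLE customer
    ADD COLUMN email VARCHAR(255);
```

Because every change lives in its own file, two developers touching the schema in the same sprint produce two new files instead of two conflicting edits to one giant shared script.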

One-way ticket to hell. There were problems nonetheless. When Flyway reported failures, people manually executed scripts to resolve the issues. Scripts were modified, and sometimes removed, after they had already been executed. Scripts sometimes ran correctly in some environments but not in all of them. Suddenly, that one dev was stuck cleaning up after everyone. And because they didn't understand Flyway, people would claim their scripts broke because of it.

The constant denial, the refusal to own mistakes and the lack of coordination led to this; not Flyway (spoiler alert!). All this time, they thought they were never the problem. And truth be told, they always were. Any approach would have worked with the correct practices.

Only after recognizing that would they probably start getting better: learn how to fix a conflict, test your script before sending it out, go through a proper code review process, read the documentation beforehand and, why not, have basic knowledge of SQL, like Dave Syer said.

“And this pain, as much as we hate it, is useful. Pain is what teaches us what to pay attention to.”
Excerpt From: Mark Manson. “The Subtle Art of Not Giving a F*ck.”

Has this happened to you? What would you tell that developer?