Flyway - Squashing Migrations

Flyway is a great tool for Databases Version Control, however, Flyway Migrations number tends to swell dramatically and harden the day-to-day work. Squashing Flyway Migrations is safe and easy - much recommended.

Chaya Berezin-Chaimson
AT&T Israel Tech Blog
7 min readNov 28, 2021

--

This blog assumes that you’re already familiar and working with Flyway. To those of you who aren’t, I recommend Flyway official documentation, it’s great and totally perfect to begin with.

I remember the first time I happened to hear about Flyway. I had moved to work on a new product and was part of a team that developed one of its micro-services from scratch. When designing this brand-new application and modeling its DB schema, one of my colleagues suggested using Flyway as our DB version control tool.

This random suggestion was mind-blowing for me.

Some legends tell that there was an era when apps were developed and maintained with no version control. It has been a long time since Bell labs introduced the first SCSS full version control tool, and nowadays none of us can think of writing a single line of application code without a version control tool.
But how about our databases? Databases play a crucial role in the life of a software application. From its initial definition to various version migrations, database changes are all managed with code (yes, even the tiny ‘ALTER’ command you just invoked straight on the environment database is code!). And just like application code, it should be managed.

The thought that there's a tool that does all the tedious and exhausting work of managing the DB code and versions, is just so simple yet so brilliant. I actually felt embarrassed, when I thought of how we worked on previous projects, where we wrote homemade utility tools to manage our DB migrations, or even worse, managed it manually, without considering that there may be a good tool that can do this for us seamlessly.

A few years have passed since that initial Flyway integration. During this time I’ve learned a lot about the tool and how to use it wisely to solve different DB versioning challenges.

The Great Inflation

In this blog, I’ll introduce the inflation of flyway migration scripts and how we can control and minimize them.

Anyone who has ever worked on a backend application with a DB schema is fully aware of the high frequency of DB-related changes, especially during the initial development phase, but also in between major versions.
When working with Flyway, each schema or data-related change is being translated into an additional migration script.
Let’s take as an example the following simple initial DB schema in a Demo project ‘University’, which contains Student, Courses, and StudentCourses tables:

‘University’ initial database scheme

Its creation code is our first Flyway migration script:

V1__create_initial_scheme.sql

During a month of further development, various changes might be applied to our schema: new tables and foreign keys(‘Teachers’ table), new columns to existing tables (‘name’ & ‘birth_date’ to ‘Students’ table, ‘teacher_id’ to ‘Teachers’ table), which make our DB schema look like this:

‘University’ developed DB scheme

This means our Flyway migration scripts will now include the following:

V1__create_initial_scheme.sql
V2__add_name_birth_columns.sql
V3__add_teachers_table.sql
V4__add_teacher_id_column_and_fk.sql

You can just imagine how many versioned scripts you’ll have when working on a large-scale DB for a few months.

Flyway works really well with a large number of migration scripts. It guarantees easy management of DB versions, regardless of how many there are. However, too many migration scripts can affect us in two ways:

  1. Maintainability
    Just like our application code, we want to have our DB-related code as clean and clear as possible.
    Tracking a specific table definition with its changes among a bunch of migration scripts can be confusing and time-consuming.
  2. Tests Performance
    A prior step of running application tests that includes DB data (either real DB in E2E tests or a test container DB in component/repository tests) is running all Flyway scripts from scratch to bring the DB to its latest clean state. The more migration scripts we have, the slower this setup step will run.

Flyway Solution - Baseline Migrations

Flyway is aware of the various changes on the database during the project lifetime, and the lengthy migration history this may lead to. Therefore, it offers the option to create a Baseline Migration.

A Baseline-Migration is a new type of Flyway migration, which lets us define a single, cumulative migration, where we’ll create the DB as it is after applying all the previous versioned migrations. This migration script will run only on new environments, so no existing environments will be disrupted.

Let’s add a Baseline Migration to our ‘University’ Demo, ‘B4__create_tables.sql’ which creates the DB in its final state, with Student, Courses, StudentCourses, and Teachers tables:

V1__create_initial_scheme.sql
V2__add_name_birth_columns.sql
V3__add_teachers_table.sql
V4__add_teacher_id_column_and_fk.sql
B4__create_tables.sql

When running ‘flyway:migrate’ on a new clean environment, migrations V1-V4 are ignored, while only ‘B4__create_tables.sql’ runs.
When running ‘flyway:migrate’ on an existing environment, in which V1-V4 has already been run, ‘B4__create_tables.sql’ is ignored.

The only problem with the Baseline-Migration solution is that it involves an additional cost. Baseline Migrations are only available in the Flyway Teams edition, which requires a paid subscription.

Baseline Migrations - Squash the Cheaper Way

For those of you who don’t have the Flyway Teams edition, I’d like to introduce you to a way of achieving the two goals of Maintainability and Test Performance. This is done by “squashing” Flyway migrations manually.

The flow of squashing the migrations is what I call the CDRB flow -
Create, Delete, Rename, Baseline.
Following this flow, you’ll be able to replace all your existing Flyway migration scripts with a single script and update your existing environments with the change.

Create - new baseline migration script:

The first step when squashing migrations is creating. This means defining and writing the single, cumulative migration script, that creates the DB as it is after applying all the previous versioned migrations, just like the Baseline Migration script. The responsibility of the content of such a cumulative script is on us. There are different tools we can use in order to export the existing DB status, i.e., the commands that create the DB schema, choose one of them, according to the database you work with.
This migration script is a simple Flyway version script, and its version will be the version of the first migration scripts that we squash.
In most cases, when we squash all the migration scripts, the version is V1.0.

Delete - previous versioned scripts:

Once we have the cumulative new version script ready, we can delete all the previous “squashed” migration scripts.

The first two steps (create & delete)leave our migration scripts folder clean with only one migration in it:

V1__create_initial_scheme_squashed.sql

At this point, new DBs, can be migrated as usual, i.e., Flyway migrate will run this single script, and the DB will be in its latest version. However, in our existing environment, the ‘flyway_schema_history’ table still holds the history of the previous scripts.
As we would like to create a new history of Flyway and a new versions order, ‘flyway_schema_history’ must also be updated. To solve this, we can use the two next steps.

Rename - flyway_scheme_history table:

What we really want to do is to delete or truncate the ‘flyway_schema_history’ table in the existing environment and create a new one. However, it is recommended to keep the previous history of Flyway, and just rename ‘flyway_schema_history’ to a different name (better a meaningful one that contains the renaming date). Once the table is renamed, Flyway identifies that the table no longer exists and will create a new empty ‘flyway_schema_history’ table to contain the new version history.

Baseline - existing environments:

As we have a clean ‘flyway_schema_history’, and a single version script containing all the commands for our current DB creation, now we have to baseline the existing environment to our new latest version:

./mvnw -nsu flyway:baseline \
-Dflyway.user=${DB_USER} \
-Dflyway.password=${DB_USER_PASSWORD} \
-Dflyway.url="jdbc:mysql://localhost:${HOST_PORT}/${DB_NAME}" \
-Dflyway.baselineVersion=1.1

The baseline command tells Flyway that this specific environment is baselined on the specific baseline version, meaning only migrations with a version above the baseline will run during ‘flyway:migrate’.

Please note that the last two steps of renaming ‘flyway_schema_history’ table, and baselining the DB, should be applied to all of your existing environments, in which Flyway has already been migrated to the last version.

Following the CDRB process, we end up with:

  • A single ‘squashed’ migration script ( V1.0), in place of previous X migrations
  • Existing environments baselined with the new current version (V1.0)
  • Setting up a new DB environment with Flyway is quick and the code is maintainable
  • Additional DB-related changes will be written as new Flyway migration scripts, beginning at V1.1 and above.

Squashing is done, yet, development work never ends, and neither do new DB-related changes. New Flyway migration scripts will be written continuously, inflating the database again. We should make squashing part of our development routine.

Call for Action

I hope you found this blog useful for your daily work with Flyway. Don’t hesitate to squash your Flyway migrations, it’s safe and can be done for free. Remember that “the best things come in small packages”.

--

--