What is Flyway?

Everything you need to know to get started with Flyway

Aston Technologies
11 min readDec 1, 2021

By: Kory Sericati, Sr. Software Developer at Aston Technologies

Just like you use Git to sync project code across environments, Flyway is a tool for versioning the state of your Java application’s SQL databases. You write your own SQL scripts defining what a database should look like and Flyway will make sure your application, wherever it gets deployed, is working with the most up-to-date version of that database. It’s easy to get Flyway up and running in under 5 minutes via the Flyway CLI, Java API, or Maven/Gradle plugins. All options are well documented and provide the same functionality, with the only differences being the syntax you’re comfortable with and when you want your database to be updated (manually with the CLI, or automatically during application deployment with the others).

When should you use Flyway?

If you’re unsure whether your project could benefit from Flyway, consider if the following apply:

❑ You have no need for separate development and production environments.

❑ You are the only person with access to your project’s database.

❑ You are confident you could get it all back up and running within a few minutes if all of your tables and procedures were dropped without warning.

If you checked all 3, versioning your current project’s database might be overkill. However, if you left any of those boxes unchecked, a few minutes of Flyway setup now could save you some database headaches down the road.

Whether your app is still in the planning phase or already deployed with multiple active database connections, adopting Flyway is as simple as adding a single dependency and defining your database connection. At the time of writing, Flyway supports 27 databases and chances are yours is one of them. Check their documentation to be sure.

Migrations

The fundamental unit of work handled by Flyway is called a migration, which is any collection of SQL statements you want to run together. Each migration lives in its own SQL file (or Java class if you’re using the Java API) and, based on how we name them, Flyway understands under what conditions and in which order they should be executed.

As shown in the screenshot above, most migration names have 4 parts:

  • Prefix: a single letter describing the migration type (Version, Undo, Repeat).
  • Version: number punctuated by underscore instead of period (1, 1_1, 1_2, …).
  • Separator: 2 underscores (__).
  • Description: what the migration does, written in snake case.

With this, we know that the first 5 example migrations beginning with V define how the database should change between versions 1, 1.1, 1.2, 2, and 2.1. But what about those last 2 files beginning with U and R?

Undo migrations are optional migrations that Flyway can use to roll back and recover from any version migrations that fail. If your database supports running DDL statements within a transaction, you probably don’t have a use for undo migrations since that rollback is already handled automatically. If you do decide to utilize undo migrations, always double-check that they’re idempotent (more on that later) since you can’t always be sure how far a versioned migration might get before failing.

Repeat migrations, as the name implies, are executed every time Flyway runs regardless of the current database version. Because of this, repeat migrations don’t include a version number in their name.

Database Version

When Flyway migrates your database, it checks the current version of your database, only applying your migration scripts that have a higher version number. This current database version number comes from a table called flyway_schema_history which Flyway will create if one doesn’t already exist.

The table is easy to read, with version, description, type, and script coming directly from your migration script names. We also get some auditing information in the form of installed_by and installed_on, as well as the time in milliseconds each migration took to execute and whether or not they succeeded with the execution_time and success columns. While these columns exist primarily to guide Flyway in applying future migrations, it’s worth pointing out that they also ensure you don’t modify past migrations.

This may seem like an obvious point but it’s an important one: the purpose of Flyway is to establish a consistent sequence of migrations that always results in the same database state. If you change a migration that’s already been applied, you’re changing that sequence which could result in a database being migrated to a “current” version that doesn’t match what you’re currently working with. This is what Flyway’s checksum column prevents. If a migration ever fails due to a mismatching checksum, it means that the migrations deployed with your application are not the same as those that created the active database and you should instead add a new migration that applies the difference.

Configuration: Spring Boot + Maven

The following configuration steps are shown using the Maven plugin, which is just one syntax for the same steps that apply to the CLI, Java API, Maven plugin, and Gradle plugin. Let’s say you have a simple Spring Boot project that uses Spring Data to interact with a MySQL database. To get Flyway up and running, all we need to do is add a dependency to the project’s pom.xml:

and tell Flyway what SQL connection(s) to work with:

This example uses an application.yml properties file, but these parameters could also be specified directly in the plugin’s pom.xml configuration if you prefer keeping it all in one place.

We could optionally also tell Flyway where we want it to check for migrations by specifying the location property. A common choice is to tell Flyway to scan the Java classpath for migrations by using the classpath prefix (the default location used by Flyway is “classpath:db/migration”).

The other option is to target a directory outside the Java classpath by using the filesystem prefix. You can specify multiple directories if you’d like, but only specifying the parent folder will do since Flyway recursively scans through nested directories.

That’s it! Assuming the given username and password exist on our target database with the appropriate privileges to execute our migrations, you can now count on Flyway to automatically apply your migrations whenever you deploy your application.

Advanced Configuration

The configuration above is all you need to get Flyway executing your SQL and making sure your application, wherever it gets deployed, is connected to the latest version of its database. But, if you’d like a little more control, additional configuration parameters are available. Of the roughly 80 parameters listed in Flyway’s configuration documentation, here are some of the more commonly useful ones:

baselineOnMigrate / baselineVersion

A database’s “baseline” version is the migration version it should be considered to already have applied by default (even if a migration has never been run). This can be useful when introducing Flyway to an existing non-empty production database, where you probably won’t want to run the first migration(s) that handle creating tables that are already in use. By setting the baselineOnMigrate configuration parameter to true, Flyway will consider any non-empty schema without a flyway_schema_history table to already be at the specified baselineVersion (default=1).

For example, if you have migrations V1, V1.1, V1.2, and V1.3 that handle setting up all the tables and data needed to get a database to the same state as your existing production database, you can set baselineOnMigrate to true and baselineVersion to 1.3. This way, when you deploy the application to prod and Flyway sees a non-empty database that doesn’t have a schema history table, it’ll consider the database to already be at version 1.3 and will only run migrations that have a version higher than that.

locations

While it’s most common that the locations configuration parameter will only specify the top-level directory containing all migrations to be run, it’s possible that you may need fine-grained control over exactly which directories are scanned. For this case, you can specify a comma-delimited list of directories as your location's configuration. A common case for specifying multiple locations is the use of a separate directory for test-related migrations. By adding that directory to your test environment’s locations, you can count on Flyway to handle seeding your test database during deployment.

It’s worth noting that the locations configuration parameter also supports specifying Amazon S3 and Google Cloud Storage buckets as migration script sources.

batch

Batch is a simple boolean parameter that, when set to true, enables the batching of up to 100 SQL statements at a time. Since you’re only sending 1 network request per 100 statements, this can theoretically save you up to 99% of your network overhead. This applies only to INSERT, UPDATE, DELETE, MERGE, and UPSERT statements, so it won’t speed up your DDL, but it can really save time on large migrations that handle moving and inserting a lot of data.

callbacks

Utilizing callbacks, you can specify scripts that run before or after different parts of the Flyway execution lifecycle (rather than as migrations). To implement this, all you need to do is have a SQL script with the same name as the desired callback and Flyway will run that script accordingly. One notably useful scenario is when specifying an additional “dev-migration” directory in your application-dev.yml. This dev migration directory could contain a script named afterMigrate.sql which will be executed after all the other migrations when deployed to the dev environment and could handle seeding the database with test data.

Check Flyway’s documentation for a full list of available callbacks.

cleanDisabled / cleanOnValidationError

The Flyway clean command is used to wipe the entire target database completely clean. It removes everything from tables to stored procedures. By setting cleanOnValidationError to true, you enable Flyway to automatically clean your database if it detects that an already applied migration’s checksum doesn’t match the migration file it sees. This can be convenient in a dev environment, allowing you to easily test and refine the same migrations without the hassle of having to manually drop everything before each commit.

Issuing a clean command on your production database, however, is generally not a great idea. As a safeguard, setting cleanDisabled to true in your production environment configuration ensures that Flyway will never clean your production database.

placeholders / placeholderPrefix / placeholderSuffix

Placeholders allow you to dynamically inject values into your migrations. By specifying key-value pairs under the placeholders configuration parameter, you enable Flyway to interpolate specific values wherever it finds their associated keys in your migrations. By default, the placeholder keys will only be interpolated to their associated values if they’re found in your migrations following the format ${inCurlyBracesWithDollarSign}. If that format doesn’t work for you, the prefix and suffix symbols can be configured with placeholderPrefix and placeholderSuffix.

outOfOrder

If you ever find yourself in a position where an additional migration needs to be added before the most recent migration that was applied in your production environment, the outOfOrder configuration parameter can help. Setting it to true will allow Flyway to execute any migrations that haven’t been applied, even if they have a version number that’s below the version of the current database. For example, if your production database has had migrations V1 and V3 applied, you could enable outOfOrder and commit migration V2 which Flyway will execute during the next deployment. I will note that this should be a last resort used to apply hotfixes to prod since updating a database out of order goes against what Flyway was made to enforce.

Tips

Flyway is a great tool that does exactly what you tell it to, but sometimes what we tell a piece of software to do isn’t what we really need from it. The following is a small collection of best practice tips to help you steer clear of common mistakes.

Always set cleanDisabled to true in production environments.

As mentioned in the advanced configuration section above, this guarantees that Flyway won’t drop your production data if you accidentally change your migration commit history.

Always write idempotent migrations.

Think of every migration as a one-way road to get a database from state A to state B. If you run the same migration a dozen times in a row, the first execution should get the database from state A to state B, but the subsequent 11 executions shouldn’t do anything since the database is already in state B.

The easiest way to do this is to surround statements in an IF condition. For example, if a migration is supposed to add a product table, it should only do that if a product table doesn’t already exist.

Some database platforms don’t allow IF ELSE logic directly in queries. To get around this, common practice is to wrap each migration’s logic in its own stored procedure. The basic pattern looks like this:

  • Drop procedure this_migrations_name if it exists.
  • Create a new procedure called this_migrations_name that’s defined as…
  • If the database is in state A, execute the migration logic that’ll bring it to state B.
  • Call procedure this_migrations_name.
  • Drop procedure this_migrations_name if it exists.

This way, even if something happens to the Flyway schema history or we accidentally execute a migration manually outside of Flyway, the end state of the database is always guaranteed to be what we’ve defined it to be.

Consider splitting up large migrations.

This is especially important if you need to define Undo migrations. Since you can never be sure exactly when a migration might fail, a migration that adds 2 columns to a table, replaces an index and moves a dozen rows of data has 15 potential fail states. A single SQL script that handles undoing 15 possible conditions is not a pretty sight. If your database doesn’t allow DDL transactions and you need to manually define rollback plans, keeping your versioned migrations as small and concise as possible will help ensure that migration failures are more manageable and easier to recover from.

For big projects, consider using timestamp versions.

If your project lives in an active repository with frequent commits coming in from different developers, it’s easy to run into situations where multiple commits contain migrations with the same version number. Instead of constantly renaming migrations and re-committing, using integer timestamps as version numbers allow you to avoid most conflicts. So, instead of making developer A and developer B decide who has to change their migration number between V4_17__redefine_index.sql and V4_17__drop_column_plant.sql, you could have migrations V202201041217__redefine_index.sql and V202201051014__drop_column_plant.sql.

Example Migrations

Putting everything together, here’s what migrations V1, V1.1, and V1.2 from the above migrations section might look like:

V1__my_initial_database.sql

V1__my_initial_database.sql

V1_1__add_new_column.sql

V1_1__add_new_column.sql

V1_2__remove_redundant_id_column.sql

V1_2__remove_redundant_id_column.sql

--

--

Aston Technologies

An IT pro services company with engineering teams specialized in enterprise networking, software services, security, data center, and UCC working nationwide.