Database Schema Maintenance with Fluent Migrator
In this article:
- Using Fluent Migrator to maintain database schemas
- Writing and applying database schema migrations
- A Database class to keep things simple
- Using up-only migrations to simplify migrations
In a previous article, we created a dev container (see article “Building a Dev Container for .NET Core”). Then we added a second container to run a database server we can use during development (see article “Dockerized SQL Server for Development”). This also served as an example how we can use pre-built container images to make our lives easier as developers.
At this point we have a working development environment but no code. In general, we can use this setup cross-platform, i.e. on Linux, Windows, MacOS, without any change required.
We are ready to introduce the next building block: creating and managing a database schema. This is the main focus of this article. We will use a simple console application to learn how to do this.
In this article we’ll be using as a running example a fictitious product named “Mahi”. The word Mahi means “task” in Te Reo Māori, the language spoken by the native people of Aotearoa, the country also known as New Zealand.
Mahi is a very simple task manager. We won’t have a commercially viable product at the end. But we will learn new concepts as we work through new features. Keep in mind, that the code base is not meant for production. You are welcome to use it as inspiration for your own work, commercial or otherwise. The responsibility is entirely yours, though, if you do.
The complete source code for this article is available at https://github.com/mahi-app/CmdLine/tree/article-2020-11-15.
To follow the steps in this article you’ll need:
- Current version of a git client
- Recent stable version of VS Code
- Docker Desktop (Windows, MacOS) or Docker Engine (Linux)
- VS Code Extension Pack “Remote Development” (identifier: ms-vscode-remote.vscode-remote-extensionpack)
Either follow the instructions in the previous articles (see “References and Additional Material” at the end of this article) or just grab the starting source code from https://github.com/mahi-app/CmdLine/tree/article-2020-11-01.
For this article we’ll be using a relational database system (RDMS) to store data. The database server, SQL Server in this instance, is running in the second container.
To store data, the database needs to know what structure to use. This structure is called a schema. It contains information about the tables, columns, indices, relationships, constraints and similar more.
There are different options for creating and maintaining database schemas. I prefer an approach where I can ignore specialized languages for the most part. Instead I want to write as much code as possible in my preferred language, in this case C#.
Fluent Migrator (https://fluentmigrator.github.io/articles/intro.html) provides a simple, yet efficient way to do describe and apply database schema changes, also known as migrations. We don’t need to learn the Data Definition Language (DDL) that comes with the database. Fluent Migrator takes care of the translation from C# into the DDL dialect specific to the selected database server, i.e. SQL Server in our case. Note that the DDL for different RDMS’s can have small variations, e.g. SQL Server’s DDL may be slightly different to Oracle’s.
Occasionally different people working on the same code base but on different branches need to apply different changes to the same schema. Some tools make this hard because they store the schema state when a migration is applied as well. Then, at merge time, this can lead to problems that can be quite time consuming to resolve.
Entity Framework (EF) is one such tool that stores state when a migration is applied. In my experience, this works up to a point. Based on my work with a large number of clients, though, it appears that even experienced teams eventually run into complex issues, e.g. when branching.
Fluent Migrator (FM) avoids this by not storing the state of the schema in the database. Instead each change gets a unique identifier and Fluent Migrator then just records that identifier. This mechanism allows Fluent Migrator to keep track of what changes have already been applied to a given database. As a result, merge conflicts are rare. The identifiers are also used to determine the sequence in which the migrations are applied.
What is a Migration?
When you create a new database on the database server it has no schema (apart perhaps from some schema created by the database server itself). Adding or modifying tables etc. represent changes to the schema. Each such change is called a migration. It is not uncommon to see a database undergo dozens if not hundreds of migrations over time.
Ideally each migration is as small as is reasonable, or “atomic”. For example, it might create just one table and the indexes the table needs. If you need another table or another column on an existing table, just create another migration. Once a migration, i.e. its code, has been pushed to the code repository, it will be deemed immutable.
There are exceptions to the rule to keep migrations small, e.g. if you have a group of changes that make more sense to be applied as one unit. Fluent Migrator allows you to treat each migration as a single transaction, provided the target database supports transactions for executing DDL. For example, as of writing SQL Server 2019 supports this. Check the documentation of your database server in case you use one other than SQL Server 2019.
A migration for Fluent Migrator is simply a C# class. It derives from an abstract base class that requires you to implement to methods: Up() and Down(). More about this later. Suffice to say, these two methods contain very simple code. At the end of this article we’ll get to a point where need to implement only one method per migration.
The CmdLine Project
Before we can start our work in earnest, we need a project. Open the repository — see “Prequisites” above — in the dev container, then open a terminal. This will be bash. Enter the command “pwd” and you should see out similar to the following (the id after “mahi@” will be different in your case):
“pwd” is a command that prints the current working directory. Within “/app” enter the command
to create new folder. We will use the directory “/app/src” for the source code. Then use
to switch to the newly created directory. Inside “/app/src” create a console project with the following command:
dotnet new console -n CmdLine
This should give you this output:
Any nuget packages on which the console application may depend are restored as well. The new directory and file layout should look similar to the following:
To be on the safe side, let’s see if we can execute the console app. In the terminal window switch to directory “/app/src/cmdline”, then execute the command:
Dotnet will build the project, then execute the program. The output should be a simple “Hello World!”. Congratulations! Well done!
Add a .gitignore File
Generally, now would be a good time to commit the changes. However, before you do so we need to add a “.gitignore” file. We don’t want to track any of the intermediate or output files.
We could create the “.gitignore” file manually. There is a better way, though. We will use the web site https://gitignore.io to generate the file content. On that site, enter the term “dotnetcore”, hit the “Enter” key so it turns the input into a label and then click the “Create” button. The website will then display the content we need. Copy that content into a new file at the root of the repository:
With the “.gitignore” file in place, we should now commit. Frequently committing and pushing speeds up the flow of code changes in your team.
Adding Fluent Migrator to Project
Fluent Migrator is packaged up into several nuget packages. We need just “FluentMigrator.Runner”. This package will pull in everything else we need to write and execute migrations.
To install this package switch to directory “/app/src/cmdline” in the terminal window, then execute the following command:
dotnet add package FluentMigrator.Runner
Our project file “CmdLine.csproj” now looks as follows:
Again, execute the command
to confirm we haven’t broken anything just yet. If you are still greeted with “Hello World!”, that’s great. Also, this is another opportunity to commit our code changes. From now on, I won’t mention this anymore. Just make sure you commit frequently to avoid losing work. There is nothing wrong with committing 10, 20 or even more times each day.
When to Separate Migrations from the Runner
We don’t have to have both, the migrations and the code that invokes the runner, in the same project. There are cases where separating them makes sense.
If you want to separate them for your own work, then the other nuget package of interest is “FluentMigrator.Extensions.SqlServer”, if you use SQL Server. The latter package would be used in the project that contains the migration.
The project invoking the runner would reference the package “FluentMigrator.Runner”. In this article we’ll keep things simple, though, and therefore install “FluentMigrator.Runner” only. We will have both, the execution of migrations as well as the migrations themselves in the same project.
The First Migration
We are going to add a few files in this article and there will be more in future articles. I found that it pays to spend a couple of thoughts on an appropriate directory structure. A fit-for-use structure makes it easier to conceptualize and to locate files when needed.
Architecturally, I found that in practice most of the time we can get away with three layers:
- Presentation Layer (this is not necessarily the User Interface)
- Business Logic or Domain Layer
- Data Access Layer
The presentation layer could be the user interface or it could be where API endpoints are handled. For example, if we are implementing a microservice we may choose to use this layer to deal with different representations of incoming or outgoing data, e.g. C# to JSON or JSON to C#.
For the Business Layer, I often use service classes. I keep the code minimal in the presentation layer, and hand off flow to the business logic layer as quickly as possible. Classes in the business layer are relatively easy to test as usually we are dealing with Plain Old C# Objects (POCOs) only.
Typically for each layer I create one directory at the root level. For instance, to represent the data access layer, I typically create a directory “DataAccess” at the root. We’ll introduce the directory structure for other layers in future articles.
Within directory “DataAccess” I typically have another directory called “Migrations”. We’ll use this for all the migration classes we need.
Creating the List Table
As you will recall from the running example, Mahi is an app for managing todo lists. Therefore, it is a reasonable choice to create the table that will hold information about lists first. In future articles we will change from migration-first to a domain class first approach.
Make sure you create the folders “DataAccess” and “Migrations” to get the following structure for the repo:
In this file, we write the following code:
Let’s have a closer look at this code. In line 5 we use the attribute “Migration” which expects one parameter of type “long”. This is the identifier of the migration. Also, Fluent Migrator sorts all migrations by ascending order of this number.
To make it easier I make this number to represent the current date and time. By grouping the digits as shown in the code, we highlight the date portion and the time portion, even though the result is still a long. The underscore to separate digits, i.e. the “digit separators”, was added in C# 7 (see https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-7#numeric-literal-syntax-improvements) to improve readability.
I used the date and time of when I wrote this piece of code (09 Nov 2020 at 0935 using 24 hours format). This is a very simple mechanism to generate unique ids which also works in team settings. It is extremely unlikely that two different developers working in the same repository at the same time will create a migration at the exact same point in time. So, whenever you write a migration just look at the time of your computer and you know what number to use.
In line 6 we see that we derive from base class Migration. Note that this class is different to the attribute of the same name. The attribute’s full class name is “MigrationAttribute”. It’s just syntactic sugar that you can omit the suffix.
The base class Migration is abstract and requires us to implement two methods: “Up()” and “Down()”. We’ll look into the implementation of “Down()” later. Here we implement “Up()” only.
Lines 15 to 18 create the table named “List” with two columns. In line 16 we say that we want a column of type Guid that is not nullable and that will serve as the primary key on the table. The type “Guid” doesn’t exist in SQL Server under that name. This is where FluentMigrator translates it to “uniqueidentifier” which is the name SQL Server uses for this data type.
In Line 17 we specify a second column to store the name of the list. This particular table probably cannot get much simpler than this.
You can build and run the project again with this change but nothing will have changed in the behavior. We still need to have the FluentMigrator runner execute our migration.
The Database Class
This is where things are getting a more interesting. We’ll add quite a bit more code in this next step. Keep in mind, though, that the code introduced here is more of the infrastructure kind. We’ll write it once and generally won’t have to change it again.
For code that deals with migrations (this article) and object-relational mapping (ORM, future article), I typically create a class named Database. Once in place I rarely need to change it, perhaps once every few months at best. Let’s create a file “Database.cs” in folder “src/CmdLine/DataAccess” with the following content:
Don’t be scared. This may look like a lot, but it’s actually quite simple. Let’s walk through the code line by line.
First, notice that “RunMigrations()” in line 9 is the only public method. All other methods are private. We’ll make use of this public method later.
Fluent NHibernate makes use of the dependency injection infrastructure of .NET Core. In line 11 we call a method that creates all services that we need and that returns a service provider.
In line 15 we set up the scope for the service provider. In line 17 we pass the service provider to the private method “RunMigrations(IServiceProvider)”. And that is all there is to the public parameterless method “RunMigrations()”. Quite simple at this abstraction level. Let’s dig a little deeper.
Lines 24 to 40 create a ServiceCollection object. All services that we want to use need to be registered with this service collection. Line 28 uses Fluent Migrator’s “AddFluentMigratorCore()”to register FluentMigrator itself.
Lines 29 to 35 configure the migration runner. In line 31 we add support for SQL Server. In line 33 we specify what connection string to use. And in line 35 we tell the runner where to look for migrations.
Line 37 adds logging. In this case we add Fluent Migrator’s console, so that we can see if and what migrations are applied when the migration runner executes.
Finally, in line 39 we create the service provider from the ServiceCollection object. Obviously, this new service provider will have access to the service collection. Any code with access to the service provider can ask it for instances of a service.
Let’s have a look at the final piece of code in this class, the private method “RunMigrations(IServiceProvider)” in lines 45 to 52.
In line 48 we use the service provider to obtain an object that implements IMigrationRunner. An object that implements this interface supports running migration. This is what we do in line 51, where we invoke “runner.MigrateUp()”.
And for now that’s all the code in class Database. One further comment on line 54, though. In this example the connection string is hard-coded. For any practical purposes we can’t use a hard-coded string. We will resolve this issue in a future article. The basic idea, though, will be to read it as a configuration value and then pass it as a parameter to class Database.
Using the Database Class
With the Database class in place, we can now make use of it in our console application. We change the content of Program.cs to the following:
In line 2 we add the using statement for the namespace that contains the Database class. In line 10 we invoke Database.RunMigrations() to apply the migrations.
As you can see, we were able to encapsulate the more complex code in class Database. In class Program where we make use of it, the code becomes very simple, effectively a single line of code. The details about how to configure the Fluent Migrator runner or how to execute migrations are encapsulated in class Database.
Apply the Migrations
Let’s see if this is actually doing what we expected. In the terminal window make sure you are in directory “/app/src/cmdline”, then execute the command
You should see output similar to the following:
The highlighted output claims that Fluent Migrator applied our migration to the database. Let’s check. Use your preferred tool to connect to the database. The parameters on your development machine are:
- Server: localhost,7610
- Database: mahi-cmdline
- User: sa
- Password: PassWord42
With that you should see something similar to the following:
We can see the table List with the two columns, “Id” and “Name”, as well as the entries under “Keys” and “Indexes”. This means the migrations have been applied.
The VersionInfo Table
If you take a closer look, you will notice that the database also contains a table named “VersionInfo”. This is where Fluent Migrator keeps a record of which migrations have already been applied. You should see something like this when you look into that table:
This confirms that all worked as we wanted and that Fluent Migrator has recorded the migration, i.e. the schema change.
If we run “dotnet run” again, then we’ll notice that all output regarding migrations is absent. Fluent Migrator won’t apply our migration a second time since it is already in place.
From here, the workflow is essentially pretty simple. If you need a schema change, write a new migration. Run your program and the schema change will be applied.
If you follow this approach from day one, you can always delete the entire schema and Fluent Migrator will be happy to reapply all changes, thus recreating the full schema. This makes it quite convenient to experiment with changes, in particular if you keep changes small and apply them incrementally.
One word of caution, though: Once a migration has been committed and pushed, it becomes immutable. Do not change an existing migration. If you don’t like what a migration does, just write a new one that reverts what the older migration did. There is no problem to have dozens or even hundreds of migrations.
The Only Way is Up: The UpOnlyMigration
You will remember that the abstract base class Migration required us to implement a “Down()” method as well. At the moment the code in method “Down()” just throws a “NotImplementedException”.
It is debatable whether the “Down()” method is really needed. I think it depends on your preferred workflow. If a migration has already been committed and pushed, in general there are no reasons to ever use “Down()”. In your development environment, you can always just remove all tables and rerun the migrations. I haven’t implemented, let alone used method “Down()” in many years. It never was a problem. In some extreme cases I just deleted the database and started from scratch. With the migrations in place, it was easy to recreate the schema. With integration tests creating the data they need, there was no need to restore pre-populated databases. I am aware that there are cases, where you need a pre-populated database. There are ways to deal with these. That’s for a different article at another time.
Having said that, I don’t want to rule out, though, that perhaps in your particular scenario, you may need going back one or more migrations. And that is fine.
Here I’d like us to make one more change that simplifies the implementation of our migrations that don’t need a “Down()” method. Let’s implement a class UpOnlyMigration. Create a file in directory “/app/src/cmdline/DataAccess/Migrations” with the following content:
A couple of comments on this class. The class is still abstract as it doesn’t implement method “Up()”. Line 20 declares it as abstract. In line 15 we see that the method “Down()” is sealed. Derived classes cannot accidentally override this method. And in line 14 we add the SuppressMessage attribute to avoid the compiler or C# language server telling us that this method is not implemented.
Using the new class UpOnlyMigration as a base class we can change the implementation of our only migration class “M201109_0935_AddTable_List” to the following:
As mentioned earlier, we can now write migrations by implementing a single method. Cool!
We now have a basic understanding of how to use Fluent Migrator to maintain a database schema. Admittedly, so far, we have created only a single table. In one of the next articles we’ll introduce additional tables as needed. We’ll also introduce indexes and other useful concepts. It will become even more interesting once we bring the object-relational mapper (ORM) into the mix.
Mixing Schema Migrations and Data Maintenance
Keep in mind that migrations look after the schema in the database. They are not meant to do anything with the data itself. If you choose to also “migrate” data, e.g. insert rows, then keep in mind that perhaps you are mixing two different things. For each change, consider which of the two, Fluent Migrator or the ORM, is the better tool.
Generally, I’d advise to clearly separate between maintaining the structure of the database on one hand and its content on the other hand. Mixing concepts usually leads to pain.
Testing migrations is a tricky question but definitely worth a thought. In particular if the migration will modify a table that has a large number of rows, you want to run some tests. This can potentially be extremely time consuming. As long as tables are small or even empty, all migrations are fast. If tables are large and also subject to the nature of the change, migrations can be very time consuming.
When you have a table with a large number of rows, I’d recommend performance test your migration before you push it out to production. Test with different, increasingly larger datasets until you have confirmed that the migration’s performance meets requirements. If it remains slow, consider replacing the slow migration with several smaller migrations where each is fast enough.
I won’t touch on this in more details in this article. However, I wanted to make you aware that nothing comes for free. The convenience of Fluent Migrator doesn’t mean we can afford to ignore completely the underlying technical limitations in terms of speed for some types of changes.
In general, though, Fluent Migrator is a tool that is easy to use and does a reliable job maintaining your database schemas.
Thank you for reading!
References and Additional Material
As always, some references and suggestions for additional material that you might find helpful for further exploration: