DIY — A database schema migration framework
A structured database like MySQL will not let you add or remove columns/tables on the fly. You need to run a query to first create a table or modify existing one. So if you don’t use a migration framework, you need to run these queries on the database before you push a new release. This has an overhead of managing all those queries and make sure they are executed across different environments. All the developers have to make sure that they have run the required migration whenever they pull code.
To solve this hassle we need database migration frameworks which smoothly keeps the database up to date. There are many libraries and frameworks that help you achieve this. But let’s try to build one on our own. It’s fairly simply.
Requirements
Let’s start by listing our requirements:
- Human intervention should not be required to manage the schema, whenever new code is released, schema should get updated automatically.
- The idea should not be restricted to a single database. We should be able to apply these principles to any database using any language.
- We should be able to rollback the schema if required
Based on the above requirements, we can brainstorm on more technical requirements/or design. We need to maintain the version of the database to go back and forth. Every migration script has a reverse-migration script that will take back whatever you have done.
Once we maintain this data, we just have to move ahead or move back depending on the requirement and invoke the relevant methods.
Implementation
We need to store the database schema version in the database. While this can be just a single version number/string, it’s a good idea to also maintain some more information along with this like a comment or the name of migration and time at which it was run.
A sample table might look something like this:
migrations
- version: long
- dateRan: long
- comment: string
- writtenBy: string
- rollbackPossible: boolean
- more information you need
When an application starts, it can check the last timestamp/version of the migration and run all the migration sequentially.
About the migration
A simple migration might follow a an interface which has two methods — migrate (move ahead) or rollback (move back)
public interface Migration {
void migrate(); void rollback();
}
The migrate method will have code to perform the migration. This can be as simple as DDL query or a custom script that do lot more than just creating and modifying tables. Rollback will undo what migrate
has done. Rollback might not be required if you are dealing with database you can reverse the DDL scripts, but we should keep it for more complex migrations with custom scripts. It also helps us keep this generic.
Our application on startup can search for all implementations of this interface. Almost all modern programming languages have a way to reflect into the running code. We could use an annotation and search for that annotation in the code. This annotation can have additional information like name of the migration, description, timestamp to run on etc. Or we can make it part of the interface as well.
Endpoints or callback to move back
In a web application we can expose an endpoint that takes the database to a previous (or future version). In a regular application we have to rely on a language specific feature or an environment variable to communicate with the application.
Can you migrate the migration table itself?
Sure. Why not? Once this is ready, we can migrate any table.
Summary
- Maintain the version in the database itself
- Use a interface to create migration, annotate them with additional information
- Startup code that will find the migrations we need to run and run them
Let me know if this satisfies your needs. Let me also know if you have more questions about your particular use case.