Keeping Django database migrations backward compatible

Here at 3YOURMIND, we use Django REST Framework for most of our backends. Django uses migration files to keep track of database changes.
When deploying an update to a server, those migrations are applied to the database.

We decided to keep those migrations backward compatible for the following reasons:

  1. It enables us to do zero-downtime deployment (“blue-green” deployment).
  2. We can switch between branches during development without getting errors.
  3. Decoupling the code version from the database version is, in general, a good idea.

To do this, we developed three tools:

  1. For checking that all migrations are backward compatible:
    django-migration-linter
  2. For adding a new NOT NULL field:
    django-add-default-value
  3. For removing fields:
    django-deprecate-fields

This article explains our migration strategy and how we use these tools to support both zero-downtime deployment and branch switching during development.

What are backward compatible migrations?

We call a migration backward compatible when it changes the database in such a way that an older code version can still use that database.

If you are a Django developer, you are probably familiar with some of these errors:

  • Field ‘myfield’ doesn't have a default value
    This happens when the migration adds a column which is NOT NULL.
  • Unknown column ‘myapp_model.myfield’ in ‘field list’
    This happens when the migration renames or removes a column.
  • Data truncated for column ‘myfield’ at row 1
    This happens when the migration alters the type of the column.
  • Table ‘myproject.myapp_model’ doesn't exist
    This happens when the migration deletes a table.

These errors are all caused by backward incompatible migrations.
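To make the "unknown column" case concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Django and MySQL, so the error text differs from the messages listed above. The table and column names are hypothetical. A query written against the old schema keeps running after a simulated migration has removed the column it selects.

```python
import sqlite3

# Hypothetical schema; an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myapp_model (id INTEGER PRIMARY KEY, myfield TEXT)")
conn.execute("INSERT INTO myapp_model (myfield) VALUES ('hello')")


def old_code_query(connection):
    """Query written against the old schema, which still has `myfield`."""
    return connection.execute("SELECT id, myfield FROM myapp_model").fetchall()


# Before the migration, the old code works.
rows_before = old_code_query(conn)

# Simulate a backward incompatible migration that removes `myfield`
# by rebuilding the table without that column.
conn.execute("CREATE TABLE myapp_model_new (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO myapp_model_new (id) SELECT id FROM myapp_model")
conn.execute("DROP TABLE myapp_model")
conn.execute("ALTER TABLE myapp_model_new RENAME TO myapp_model")

# After the migration, the same old query fails.
try:
    old_code_query(conn)
    old_code_broke = False
    error_message = ""
except sqlite3.OperationalError as exc:
    old_code_broke = True
    error_message = str(exc)
```

In a blue-green setup, this is the situation the old server would be in between the migration and the switch.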

Why should I keep my migrations backward compatible?

Because blue-green deployment is not possible otherwise:

With backward compatible migrations, the old server (blue) can still use the migrated database. Once the new server (green) has been smoke-tested, you can make the switch and route your users to it without them noticing.

As a nice side effect, keeping the migrations backward compatible helps you switch branches during development. You could also achieve this by having one database per branch. However, personally, I prefer to have only one database across branches.

How can I keep my migrations backward compatible?

1. Lint your migrations

First of all, use django-migration-linter, which was developed by 3YOURMIND and David Wobrock. The linter detects backward incompatible migrations and can be integrated nicely into your CI pipeline or directly into your tests.

We found that it makes sense to keep the migrations backward compatible most of the time, but not always. From time to time, we do a major version bump and allow backward incompatible migrations (together with a maintenance window). For this, the migration linter offers several ways to ignore migrations.
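As a rough sketch of the setup, the linter is enabled like any other Django app in the project settings. The surrounding app names below are placeholders for illustration; only the django_migration_linter entry matters.

```python
# settings.py (fragment); "myapp" is a hypothetical project app.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    "myapp",
    # Registers the `lintmigrations` management command.
    "django_migration_linter",
]
```

A CI job can then run `python manage.py lintmigrations` as one of its steps, so that a backward incompatible migration fails the pipeline before it reaches a server.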

2. Add new fields

Quite often, you may want to add a NOT NULL field to your model. If older versions of your code should still be able to insert new rows, you need to add a default value to the SQL schema. With a default in place, the database accepts inserts that omit the new field. Unfortunately, Django does not set the SQL default value (because the default could be dynamic), so we need to do it manually with django-add-default-value by adding this to the migration:

AddDefaultValue(
    model_name='my_model',
    name='my_field',
    value='my_default'
)

That would result in the following SQL:

-- Add to field my_field the default value my_default
ALTER TABLE `my_app_my_model`
ALTER COLUMN `my_field`
SET DEFAULT 'my_default';

The migration linter works well together with that: it will fail on new NOT NULL fields but will pass if you used AddDefaultValue().
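The effect of a database-level default can be tried out without Django. The following sketch uses Python's built-in sqlite3 module instead of MySQL, with hypothetical table names. It compares two schemas for the same new column: one without a default and one with it. The "old code" insert omits the new column in both cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# New NOT NULL column without a database-level default.
conn.execute(
    "CREATE TABLE without_default ("
    "id INTEGER PRIMARY KEY, my_field TEXT NOT NULL)"
)
# Same column, but with a database-level default.
conn.execute(
    "CREATE TABLE with_default ("
    "id INTEGER PRIMARY KEY, my_field TEXT NOT NULL DEFAULT 'my_default')"
)


def old_code_insert(table):
    """Old code does not know about `my_field`, so it omits the column."""
    conn.execute(f"INSERT INTO {table} (id) VALUES (1)")


# Without a default, the database rejects the old code's insert.
try:
    old_code_insert("without_default")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

# With a default, the insert succeeds and the database fills in the value.
old_code_insert("with_default")
stored_value = conn.execute("SELECT my_field FROM with_default").fetchone()[0]
```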

3. Delete old fields

Deleting fields requires a two-step procedure. You first need to make sure that nobody is using the field before you can delete it.

To make that easier, we developed django-deprecate-fields.

Instead of just deleting the field from the model, mark the field as deprecated like this:

myfield = deprecate_field(models.CharField(max_length=255))

Doing this keeps the field visible to Django at migration time but hides it at runtime. It also adds null=True to the field so that new rows can still be inserted.

After you are sure that all servers use a code version with deprecate_field() wrapped around your field, you can then remove the field completely.

We do that cleanup step as part of a major release.
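To illustrate the idea of "visible at migration time, hidden at runtime", here is a simplified sketch of how such a wrapper could behave. It is not the code of django-deprecate-fields: the detection via command-line arguments, the list of commands, and the FakeField and HiddenField classes are all assumptions made for this example.

```python
import sys

# Commands during which the field should stay visible (assumed list).
MIGRATION_COMMANDS = {"makemigrations", "migrate", "showmigrations"}


class FakeField:
    """Hypothetical stand-in for a Django model field."""

    def __init__(self, null=False):
        self.null = null


class HiddenField:
    """Hypothetical runtime placeholder: reading the attribute yields None."""

    def __get__(self, instance, owner=None):
        return None


def deprecate_field_sketch(field, argv=None):
    """Return the real field for migration commands, a placeholder otherwise."""
    argv = sys.argv if argv is None else argv
    if MIGRATION_COMMANDS & set(argv):
        # Keep the column in the schema, but allow rows without a value.
        field.null = True
        return field
    return HiddenField()


at_migration_time = deprecate_field_sketch(FakeField(), argv=["manage.py", "migrate"])
at_runtime = deprecate_field_sketch(FakeField(), argv=["manage.py", "runserver"])
```

In this sketch, the migration commands still see a nullable field, so the column stays in the database, while application code running under any other command no longer gets a real field.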

4. Rename fields

Avoid this if possible. If you need to rename a field anyway, the backward compatible procedure would be:

  1. Create a new field
  2. Write a data migration
  3. Delete the old field using deprecate_field()
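The first two steps above can be sketched at the database level with Python's built-in sqlite3 module. This is a simplified illustration with hypothetical table and column names, not a Django migration. The point is that the old column stays in place, so old code keeps working while new code reads the new column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myapp_model (id INTEGER PRIMARY KEY, old_name TEXT)")
conn.execute("INSERT INTO myapp_model (old_name) VALUES ('Alice')")

# Step 1: create the new column. It is nullable, so old code that does not
# know about it can still insert rows.
conn.execute("ALTER TABLE myapp_model ADD COLUMN new_name TEXT")

# Step 2: data migration that copies the existing values over.
conn.execute("UPDATE myapp_model SET new_name = old_name")

# Old code still reads the old column; new code reads the new one.
old_view = conn.execute("SELECT old_name FROM myapp_model").fetchall()
new_view = conn.execute("SELECT new_name FROM myapp_model").fetchall()

# Step 3 (not shown): deprecate `old_name` and remove it in a later release.
```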

Conclusion

If you have an evolving Django application with many migrations that needs zero-downtime deployment, you can use the three tools presented above to ensure database compatibility between versions. Following this strategy takes some extra work. And of course, you also need some infrastructure tooling to spin up the servers, run the migrations, and make the switch, which is not covered in this article.

For us, it definitely pays off, and we can highly recommend it!

Oh, and if you also like solving challenging problems, we are hiring :)