“Migrations zero” or how to handle migrations on a large Django project

Xavier Dubuc
Oct 16, 2018

(Update, March 9th 2024: someone has built a Python package to ease the integration of this concept into a CI/CD environment. I haven’t tested it, but it is surely worth a look: https://pypi.org/project/django-migration-zero)

Working with Django means working with migrations. Migrations are operations that must be run on a database so that its state stays consistent with the state of the code. Maintaining these migrations over time can be tricky, and executing them can take a lot of time. “Migrations zero” is a solution to this. In this article, I’ll explain what it is, why it is needed and how it can be achieved.

What is the problem with migrations in large projects?

First, let’s see how migrations can become such a pain. It comes down to one sentence:

“With a large project comes a large number of migrations.”

Also, there can be dependencies between migrations, and Django needs to be aware of them in order to run migration A before migration B if B depends on A. To do this, Django builds a tree, the migrations tree, and the time needed to build it depends on the number of migrations and of dependencies between them. As I’m writing this, I’m working on a project that used to have about 580 migrations, and running migrate or makemigrations took about 20 minutes.
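To make the ordering idea concrete, here is a minimal sketch that uses the graphlib module from Python’s standard library (Python 3.9+). It is not Django’s actual implementation, and the app and migration names are made up for the example; it only shows that a migration can run once everything it depends on has run.

```python
from graphlib import TopologicalSorter

# Hypothetical migrations: each key maps to the set of migrations it depends on.
dependencies = {
    ("shop", "0001_initial"): set(),
    ("shop", "0002_add_price"): {("shop", "0001_initial")},
    ("orders", "0001_initial"): {("shop", "0001_initial")},
    ("orders", "0002_link_price"): {
        ("orders", "0001_initial"),
        ("shop", "0002_add_price"),
    },
}


def execution_order(deps):
    """Return the migrations in an order where every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
for app, name in order:
    print(f"{app}.{name}")
```

The more migrations and dependencies there are, the bigger this structure gets, which is why keeping the set small pays off.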

You can see it coming: every time you run the tests, every time you create a new migration, you lose 20 minutes, and this is the problem.

What is migrations zero?

Migrations zero is the name given to a set of migrations that is necessary and sufficient to build the database as it has to be (without any data, and without anything to migrate from). In that way of thinking, all other migrations describe the operations needed to transform the PRODUCTION database state into the DEVELOPMENT database state. These migrations can contain data migrations, new table creations, etc.

Migrations zero is by definition a subset of (or equal to) the set of all existing migrations.

The challenges of the migrations zero pattern are:

  • build the migrations zero set the first time
  • keep the migrations zero set coherent and minimal

To meet these challenges, we need to be able to rebuild this migrations zero set easily and as frequently as possible, so that the migrations tree does not grow too much. The ideal would be to rebuild the set as soon as the PRODUCTION database is updated, but that will depend on your deployment policy. (If you use an automated deployment system and you intend to use the migrations zero pattern, there are some concerns to think about; I’ll talk about them in a future article.)

Be careful: building the migrations zero set requires deleting all migrations. You need to be sure that those migrations have already been executed on your PRODUCTION database and that they are no longer required.

How to build the migrations zero set?

Now that we know what it is and why it’s needed, it’s time to introduce the methodology used to build the migrations zero set. This section will give you all the keys you need to achieve it.

Different types of migrations

Before getting into details, let’s begin with some definitions. Migrations differ by the operations they run. There are mainly 3 types of migrations:

  • Structural migration: these migrations only contain table and relation modifications, generally generated automatically by Django.
  • Data migration: these migrations contain at least one RunPython operation and transfer data from one model to another or create data based on other data; in other words, they need logic to create data.
  • Fixture migration: these migrations contain at least one RunPython operation and load into the database the data the application needs to run, for example a list of countries or a list of user statuses. They don’t need logic to create data.

(A migration can also be a mix of two or all of the types above.)

Building process

For this section, I was inspired by an article by Vitor Freitas.

The building process is mainly composed of 2 steps:

  1. clear the current migrations without harming the application
  2. generate the initial migration state

The second step is quite simple, but the first can be tricky, as multiple things can happen and it depends on the content of your migrations.

Clear the current migrations

First, you have to be certain that the state of your models is strictly identical to the database state. To be sure of that, run python manage.py makemigrations. (This command compares your models with the existing migration files, so also make sure that every existing migration has been applied, for example with python manage.py showmigrations.)

  • If the output is No changes detected, then you can proceed.
  • Otherwise, it will generate some migrations that need to be executed before the clearing. Be careful: at this point, these migrations may also be needed for other environments (PRODUCTION, for example!), so it may be wise to commit them before going forward.

Now it’s time to delete all migrations. The proper way would be to fake the unapplication of all migrations for each app, but it can lead you into a deadlock of dependencies. You can try that way first and, if you run into the deadlock, try the hard way.

The proper way: for each app of the project (named app in the command below), run

python manage.py migrate --fake app zero
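If the project has many apps, typing that command for each of them gets tedious. Here is a small sketch that builds one command per app and can run them in order. The app labels are hypothetical and the script assumes it is started from the directory that contains manage.py.

```python
import subprocess
import sys


def fake_unapply_commands(app_labels):
    """Build one 'migrate --fake <app> zero' command per app label."""
    return [
        [sys.executable, "manage.py", "migrate", "--fake", label, "zero"]
        for label in app_labels
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical app labels; replace them with the apps of your project.
commands = fake_unapply_commands(["shop", "orders"])
for command in commands:
    # Show what would be executed, without the interpreter path.
    print(" ".join(command[1:]))
```

Calling run_all(commands) executes them for real. Because check=True raises an error on the first failing command, a dependency deadlock stops the loop instead of going unnoticed.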

The hard way: clear the django_migrations table.
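As an illustration of the hard way, here is a sketch that assumes an SQLite database and uses Python’s sqlite3 module. The demo runs on an in-memory database with a simplified copy of the django_migrations table; with PostgreSQL or MySQL you would run the same DELETE statement through your usual client. Back up the table first if you are not sure.

```python
import sqlite3


def clear_migration_history(connection):
    """Delete every row of django_migrations and return how many were removed."""
    with connection:  # commits on success, rolls back on error
        (count,) = connection.execute(
            "SELECT COUNT(*) FROM django_migrations"
        ).fetchone()
        connection.execute("DELETE FROM django_migrations")
    return count


# Demo on an in-memory database with a simplified django_migrations table.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE django_migrations ("
    "id INTEGER PRIMARY KEY, app TEXT, name TEXT, applied TEXT)"
)
connection.executemany(
    "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)",
    [
        ("shop", "0001_initial", "2018-10-01"),
        ("shop", "0002_add_price", "2018-10-02"),
    ],
)
removed = clear_migration_history(connection)
remaining = connection.execute(
    "SELECT COUNT(*) FROM django_migrations"
).fetchone()[0]
print(removed, remaining)
```

Only the migration history is removed; the tables created by those migrations stay untouched.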

Now, the next step is to delete all migration files, BUT before that, you’ll want to check every migration to determine which type it belongs to.

  • If it’s a structural migration, you can consider it safe to delete (its operations will be recreated in the second step).
  • If it’s a data migration, you can normally consider it safe to delete without any further consideration (it only has to be executed once, then it is useless). Be sure, though, that it does not contain some part of a “fixture migration”.
  • If it’s a fixture migration, you need to take care of it.
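Checking hundreds of files by hand is slow, so here is a rough sketch that lists the migration files mentioning RunPython, which are the ones that deserve a manual look. It relies on a plain text search and assumes migrations live in folders named migrations; the demo files are made up.

```python
import re
import tempfile
from pathlib import Path

RUN_PYTHON = re.compile(r"\bRunPython\b")


def migrations_needing_review(project_root):
    """Return the migration files that mention RunPython, sorted by path."""
    flagged = []
    for path in Path(project_root).rglob("migrations/*.py"):
        if path.name == "__init__.py":
            continue
        if RUN_PYTHON.search(path.read_text(encoding="utf-8")):
            flagged.append(path)
    return sorted(flagged)


# Demo on a throwaway project tree with made-up migration files.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root, "shop", "migrations")
    folder.mkdir(parents=True)
    (folder / "__init__.py").write_text("")
    (folder / "0001_initial.py").write_text("operations = ['CreateModel']\n")
    (folder / "0002_load_countries.py").write_text("operations = ['RunPython']\n")
    flagged_names = [path.name for path in migrations_needing_review(root)]

print(flagged_names)
```

The script cannot tell a data migration from a fixture migration; that decision stays with you. If your virtual environment lives inside the project folder, point the function at your own apps only, otherwise the migrations of installed packages will show up too.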

As said before, fixture migrations contain data to load into the database so that the application can work. Deleting these migrations could be a problem: if you want to build a new system from scratch, you won’t be able to run the application. The solution is to create a fixture that loads this data. (I could write an article about that.) Once the fixture is created, your fixture migration can be considered safe to delete.
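To give an idea of what such a fixture looks like, here is a sketch that builds a JSON fixture in the layout Django uses (a list of objects with model, pk and fields). The geo.country model and its name and code fields are hypothetical, and the primary keys are simply numbered from 1.

```python
import json


def build_fixture(model_label, rows):
    """Turn plain dicts into the list-of-objects layout of a Django JSON fixture."""
    return [
        {"model": model_label, "pk": pk, "fields": fields}
        for pk, fields in enumerate(rows, start=1)
    ]


# Hypothetical reference data that a fixture migration used to create.
countries = [
    {"name": "Belgium", "code": "BE"},
    {"name": "France", "code": "FR"},
]
fixture = build_fixture("geo.country", countries)
fixture_json = json.dumps(fixture, indent=2)
print(fixture_json)
```

Saved as geo/fixtures/countries.json, this could then be loaded with python manage.py loaddata countries. If the data already exists in a database, python manage.py dumpdata geo.country --indent 2 exports it in the same format, which is usually easier than writing it by hand.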

As soon as you have checked all the migrations and handled them correctly, you can go ahead and remove all the migration files.
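Removing the files can be scripted too. The sketch below deletes every Python file found in a migrations folder except __init__.py, which has to stay so that the folder remains a Python package. It defaults to a dry run that only lists the files, and the demo works on a temporary folder with made-up file names.

```python
import tempfile
from pathlib import Path


def delete_migration_files(project_root, dry_run=True):
    """List (and optionally delete) migration modules, keeping each __init__.py."""
    targets = sorted(
        path
        for path in Path(project_root).rglob("migrations/*.py")
        if path.name != "__init__.py"
    )
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets


# Demo on a throwaway project tree.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root, "shop", "migrations")
    folder.mkdir(parents=True)
    for name in ("__init__.py", "0001_initial.py", "0002_add_price.py"):
        (folder / name).write_text("")
    deleted = [path.name for path in delete_migration_files(root, dry_run=False)]
    remaining = sorted(path.name for path in folder.iterdir())

print(deleted, remaining)
```

Run it with dry_run=True first and read the list. As with the review step, make sure the path you give does not contain a virtual environment, or you would delete the migrations of Django itself and of your dependencies.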

Generate migrations zero set

Now that you have deleted all migrations, you can proceed to step 2: generate the migrations zero set.

To achieve it, you simply need to run

python manage.py makemigrations

It will generate the migrations zero set for you. At this point the migrations zero set is generated and the main work is done. However, you still need to do one thing in order to keep developing with migrations: you need to tell your database that all migrations from the migrations zero set have already been executed.

For that, just fake the migrations:

python manage.py migrate --fake

You can then run all the tests to be sure nothing has been broken. If all the tests pass, you can commit and push the new migrations zero set.

How to work with the migrations zero set?

Each time the migrations zero set is regenerated, PRODUCTION (and every developer except the one who generated it) has to handle the arrival of the new set. Normally, you just have to clean the django_migrations table (not mandatory) and fake the migrations zero set by running python manage.py migrate --fake.

But in some cases you could face an undesired situation: you have just generated a migration based on the old state and someone has just pushed the new migrations zero set. In that case, you cannot push your commits, as your migration won’t be executable. This is something you really want to avoid, so everyone should be told when a given commit rebuilds the migrations zero set. But if it happens anyway, follow the instructions below.

1. If your generated migration is a structural one:

  • delete it
  • run python manage.py migrate --fake to base your database on the new migrations zero set
  • run python manage.py makemigrations to recreate your migration
  • run python manage.py migrate if this migration has not been executed on your database yet; otherwise, add the --fake option

2. If it’s a data migration:

  • copy the migration content (the function executed by the RunPython operation)
  • delete the migration
  • run python manage.py migrate --fake to base your database on the new migrations zero set
  • run python manage.py makemigrations app --empty (where app is the label of your app) to recreate an empty migration
  • paste the previously copied content into the newly generated migration
  • run python manage.py migrate if this migration has not been executed on your database yet; otherwise, add the --fake option

3. If it’s a fixture migration:

  • create the associated fixture
  • delete the migration
  • run python manage.py migrate --fake to base your database on the new migrations zero set
  • be sure to also load the created fixture

And that’s it, you are now a migrations zero professional!

Xavier Dubuc

Lead Developer @BHC in Belgium. Love python, Odoo and project managing.