What we learned from handling databases using Django

Published in

HARA Engineering

5 min read · Aug 9, 2019

Introduction

During the initial development of HARA, requirements flooded in that had to be delivered immediately while we had limited resources. In the rush to deliver new features, we ran into difficulties with Django migrations that caused our backend services to stop working properly.

In this article, we would like to share our experience handling migration trouble across our development environments.

We use Django REST Framework to provide the endpoints consumed by HARA Agent Apps and other related dashboards. Django provides many qualities essential for scaling up, such as security, speed, and simplicity. Features like the web browsable API, authentication, and serialization with support for various data sources make development easier. Having the framework written in Python also helps greatly, as there are plenty of libraries and community discussions around it.

We use Django 1.11 with Django REST Framework and PostgreSQL as the DBMS, develop under Docker, and deploy serverless using Zappa. Zappa quickly deploys the Django project on AWS Lambda and AWS API Gateway: it packages the source code, uploads it to S3, and then deploys it to AWS Lambda and AWS API Gateway.

Django migrations

Django has a mechanism called migrations that handles changes in the database by detecting changes made to models. Once a model is changed, run python manage.py makemigrations in the command prompt. This command detects the changes and creates a migration file. Once the migration file is created, run python manage.py migrate to apply the changes to the database.
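The two-step workflow above can be sketched in Python. This is a minimal sketch, not part of our pipeline: the helper names migration_workflow and run_workflow are hypothetical, and it assumes manage.py sits in the current directory. By default it only prints the commands instead of running them.

```python
import subprocess


def migration_workflow(app_label=None):
    """Return the makemigrations and migrate commands, in the order they run.

    app_label is optional; both commands accept an app label to limit
    their scope to a single app.
    """
    target = [app_label] if app_label else []
    return [
        ["python", "manage.py", "makemigrations", *target],
        ["python", "manage.py", "migrate", *target],
    ]


def run_workflow(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the workflow at the first failing command,
            # so migrate never runs after a failed makemigrations.
            subprocess.run(command, check=True)


# Dry run for the organization app: prints the two commands in order.
run_workflow(migration_workflow("organization"))
```

Keeping the dry run as the default makes it easy to review which commands would run before touching a real database.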

This is an example model for the organization app. It creates two tables, organization and organization_organizationtype, with the fields listed below:

From the snippet above, Django detects the changes and generates the file 0001_initial.py, which belongs to the organization app, as shown below:

A migration file contains the operations to be performed on the database. It can also be edited manually as needed. Migrations that have already been applied are recorded in the django_migrations table, as shown below:

We currently deploy our services through three environment stages: development, staging, and production.

Sometimes we need to change either a migration file or the table structure manually to fix an error that occurs in a particular environment. These errors have two main causes:
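As an illustration of how the django_migrations table can be inspected, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere. The column names (id, app, name, applied) follow Django's own table, but the rows are made-up sample data and the helper applied_migrations is hypothetical.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: SQLite instead
# of PostgreSQL, only so the sketch is self-contained).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_migrations ("
    "id INTEGER PRIMARY KEY, app TEXT, name TEXT, applied TEXT)"
)

# Made-up sample rows: one row per migration that has been applied.
conn.executemany(
    "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)",
    [
        ("contenttypes", "0001_initial", "2019-08-01 10:00:00"),
        ("organization", "0001_initial", "2019-08-01 10:00:05"),
    ],
)


def applied_migrations(connection, app_label):
    """Return the names of the migrations recorded for one app, oldest first."""
    rows = connection.execute(
        "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
        (app_label,),
    )
    return [name for (name,) in rows]


print(applied_migrations(conn, "organization"))  # ['0001_initial']
```

Running the same query against each environment is a quick way to see which migrations a given stage believes it has applied.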

First: most errors that occur during migration are due to differences in migration state between environment stages. This happens when we skip the deploy process in one of the stages. For example, say we deploy two releases containing model alterations to development and staging, then jump straight to deploying the last release to production. The migrations that need to run sequentially pile up, and the migration does not work properly.

Second: we ignored our migration files (we did not push the migration files to the repo). Instead, we let the migrations be generated and executed in the deployment pipeline environments. This is not a good practice, especially when many people contribute changes. It is quite a hassle to manage migrations as a team: anyone can make changes that affect the schema while others are working against it, which can leave the database schema inconsistent. To fix this, we can edit the field or make other changes directly in the database. Alternatively, we can equalize the state across stages by copying the production database into development and staging, since we are not able to reset the migrations at this point.

Common errors related to Django migrations:
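The state difference described in the first cause can be detected before deploying. The following is a minimal sketch, assuming the list of applied migrations for each stage has already been read (for example from each stage's django_migrations table). The function name missing_migrations and the migration names other than 0001_initial are hypothetical.

```python
def missing_migrations(source_applied, target_applied):
    """Return migrations applied in the source stage but not in the target.

    Both arguments are lists of (app_label, migration_name) tuples.
    The result keeps the order of the source stage, which is the order
    in which the target stage still has to apply them.
    """
    already_applied = set(target_applied)
    return [m for m in source_applied if m not in already_applied]


# Hypothetical state: staging received two releases that production skipped.
staging = [
    ("organization", "0001_initial"),
    ("organization", "0002_add_type"),
    ("organization", "0003_rename_field"),
]
production = [
    ("organization", "0001_initial"),
]

pending = missing_migrations(staging, production)
for app_label, name in pending:
    print(f"{app_label}: {name} has not been applied in production")
```

A non-empty result before a production deploy is a signal that more than one release's worth of migrations is about to run at once.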

Relation table_name already exists. This happens when a migration file tries to create a model but the table already exists in the database, so Django aborts the operation and shows the error. To work around this, you can push two migration files: the one causing the error and the one before it. Move the existing table's definition into the migrations.CreateModel operation of the earlier file.
The field to be created already exists. In this case, Django reports that column_name of relation table_name already exists. The fix is similar to the previous error, where the relation already exists: push the file causing the error and the one before it, this time using the migrations.AddField operation. You will also need to modify the earlier file as needed.
Django reports a column or relation as non-existent. This happens when a migration adds a foreign key or constraint to a field, but Django cannot execute it because the column referenced by the foreign key does not exist. We had to modify the table's field manually in the database, as handling it through migrations would have taken multiple attempts.
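To tell the three errors apart in deployment logs, a small classifier can help. This is a minimal sketch under stated assumptions: the patterns follow the way PostgreSQL typically words these messages, the category labels and the classify function are hypothetical, and the table and column names in the example are made up.

```python
import re

# Order matters: the "column ... already exists" message also contains
# the text 'relation "..." already exists', so it is checked first.
PATTERNS = [
    (
        "column_exists",
        re.compile(r'column "[^"]+" of relation "[^"]+" already exists'),
    ),
    (
        "table_exists",
        re.compile(r'relation "[^"]+" already exists'),
    ),
    (
        "missing_reference",
        re.compile(
            r'(?:column|relation) "[^"]+"(?: of relation "[^"]+")? does not exist'
        ),
    ),
]


def classify(message):
    """Map a database error message to one of the three cases above."""
    for label, pattern in PATTERNS:
        if pattern.search(message):
            return label
    return "unknown"


# Made-up messages in the style of the three errors described above.
print(classify('relation "organization" already exists'))
print(classify('column "name" of relation "organization" already exists'))
print(classify('column "type_id" of relation "organization" does not exist'))
```

The label only points to which of the fixes above to try first; it does not replace reading the full traceback.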

Best practices implemented so far:

Communicate well with your team when modifying models. This reduces the chance of changes conflicting with one another.
Commit migration files to the repo. This may be tricky in local development, since every developer has to pull and apply changes from other environments to their local one. However, it keeps the code up to date and reduces conflicts in the master branch.
Squash migrations periodically. This cuts the time spent running the migration commands during deployment, because Django runs each migration file one by one, from the first up to the current state of the database schema.
When a conflict occurs, keep the migration files from both the test and deploy environments. This way, you can compare them and observe where the conflict comes from.
Run the migration command separately per app to avoid circular dependencies. Migrate the apps with no dependencies first, followed by the apps that depend on them.
Deploy sequentially from development to staging and then to production. Do not skip any release, as that may cause the database state to differ between stages. This way, you will not have to treat and migrate each release differently.
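The per-app ordering mentioned above can be computed rather than maintained by hand. The following is a minimal sketch that sorts apps so that each one comes after the apps it depends on. The function migration_order is hypothetical, and so are the app labels users and reports and the dependencies between them.

```python
def migration_order(dependencies):
    """Order app labels so every app comes after the apps it depends on.

    dependencies maps an app label to the set of app labels it depends on.
    Raises ValueError when the dependencies form a cycle.
    """
    remaining = {app: set(deps) for app, deps in dependencies.items()}
    # Apps that are only mentioned as a dependency have no dependencies
    # of their own as far as this sketch knows.
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    order = []
    while remaining:
        # Apps whose dependencies have all been migrated already.
        ready = sorted(app for app, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError(
                "circular dependency among: " + ", ".join(sorted(remaining))
            )
        for app in ready:
            order.append(app)
            del remaining[app]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


# Hypothetical dependency map between apps.
apps = {
    "organization": set(),
    "users": {"organization"},
    "reports": {"users", "organization"},
}

commands = [f"python manage.py migrate {app}" for app in migration_order(apps)]
for command in commands:
    print(command)
```

Raising an error on a cycle, instead of guessing an order, surfaces the circular dependency early so it can be resolved in the models themselves.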

With these approaches, we hope you will be able to solve any ongoing issues. Django migrations help manage schema changes automatically. They let us keep track of different versions of the changes, so we can modify them or roll them back if needed. Still, they can be burdensome if you are not prepared, since Django is very strict about errors: one small mistake can trigger an HTTP 500 error.

We are still scaling our application. There are still many features to be added or modified. That’s why maintaining back-end services, such as data and migrations, is important.

At the end of the day, it is thanks to the team and the developer community that we are able to solve these issues.

Join our community on Telegram!

Learn more about HARA

Visit the HARA website
Read our White Paper
Join our Telegram Community
Follow our Social Media
Facebook, Twitter, Instagram, Medium & LinkedIn

Written by Muhamad Farikhin