We’d want to make sure our donuts’ data is valid, wouldn’t we?

Adding validation support for JSON in Django Models

It’s pretty cool that Django supports PostgreSQL’s JSON features. A JSONField() can be assigned to a model attribute to store JSON data. However, enforcing pre-save validation on this field can be tricky. Doing it in your serializers, by picking apart Python dicts by hand, is a hassle and gets ugly pretty quickly.

Fortunately, there’s a great tool available to validate JSON data. It’s called JSON Schema, and it has implementations in multiple languages, including Python. It can check data types, make sure that strings match enums, and allow additional properties that don’t need any validation. You can also nest data types, for example an array of objects where each object has a fixed set of fields with data types of their own. Here’s an example of a schema against which we can validate our data:
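As a rough sketch of what such a schema could contain, here is one written as a Python dict so that it can be checked directly with the jsonschema package. The field names (company, position, years), the enum values and the is_valid helper are made up for illustration and are not taken from the original schema.

```python
import jsonschema

# Hypothetical schema: a list of past jobs, each described by an object.
WORK_EXPERIENCE_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "company": {"type": "string"},
            # The string must be one of the listed values.
            "position": {"type": "string", "enum": ["intern", "engineer", "manager"]},
            "years": {"type": "number", "minimum": 0},
        },
        "required": ["company", "position"],
        # Extra keys are accepted and left unvalidated.
        "additionalProperties": True,
    },
}


def is_valid(data):
    """Return True if `data` conforms to WORK_EXPERIENCE_SCHEMA, else False."""
    try:
        jsonschema.validate(instance=data, schema=WORK_EXPERIENCE_SCHEMA)
    except jsonschema.ValidationError:
        return False
    return True
```

With this sketch, a list such as [{"company": "Acme", "position": "engineer", "note": "remote"}] passes, while an entry with an unknown position or a missing company does not.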

JSON schema for data holding an employee’s work experience

The idea here is to link a schema to your field and validate your JSON against it before saving it into the database. We’ll extend the JSONField() provided by Django and add pre-save validation to it.

We’re using jsonschema here, a Python implementation of JSON Schema, which validates our data against a schema such as the one defined above. The constructor expects a path to a JSON file, which the _schema_data property loads into memory. The pre_save method makes sure that validation is performed before the model instance is saved.

Using this field is pretty straightforward. You’ll need to define your schema, save it to a file, and pass the relative path to that file as an argument when you declare the field on your model.

Now if we try saving an employee instance with JSON that doesn’t conform to the schema defined above, a ValidationError exception will be raised.

This is a great way to automate JSON validation instead of doing it in your serializers or elsewhere. We can go further and make the schema configurable, so that it can differ for each model instance. For that, we’d have to pass the schema to the field while saving the instance and manually call our validation method.
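The per-instance idea can be illustrated without Django at all. In the sketch below, Record is a plain-Python stand-in for a model instance (a hypothetical name, not a Django model), and the schema is handed to save() each time instead of being fixed on the field.

```python
import jsonschema


class Record:
    """Plain-Python stand-in for a model instance (hypothetical, not a Django model)."""

    def __init__(self, data):
        self.data = data
        self.saved = False

    def save(self, schema):
        # Validate against the schema supplied for this particular save.
        # jsonschema.validate raises jsonschema.ValidationError on bad data,
        # so the line below only runs when the data conforms.
        jsonschema.validate(instance=self.data, schema=schema)
        self.saved = True  # a real model would call super().save() here
```

In a real model, the same call would sit in an overridden save() that accepts the schema as an extra argument, which means every caller has to remember to pass one.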

Got questions or suggestions? Comment on this post and I’ll try my best to answer them.