JSONField Schema Validation in Django Rest Framework using Insomnia

Mohammed Kamil Khan
Published in Analytics Vidhya
7 min read · Dec 2, 2021


Before looking at how to validate JSONFields in Django REST Framework, it helps to notice one thing: when POST-ing a record, alongside the other non-JSON fields, the request is accepted regardless of whether you have included:

  1. An empty jsonfield or
  2. A filled jsonfield with various types of values

Let us consider the following example to understand the above:

models.py:

from django.db import models

class Student(models.Model):
    student_name = models.CharField(max_length=100)
    roll_no = models.IntegerField()
    detailed_data = models.JSONField()

serializers.py:

from rest_framework import serializers
from studentapp.models import Student

class StudentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Student
        fields = '__all__'

views.py:

from studentapp.serializers import StudentSerializer
from rest_framework.views import APIView
from rest_framework.response import Response
from rest_framework import status

class StudentView(APIView):
    def post(self, request):
        serializer = StudentSerializer(data=request.data)
        if serializer.is_valid():
            serializer.save()
            # Serializer is valid: the record has been saved
            return Response(serializer.data, status=status.HTTP_201_CREATED)
        # Serializer is not valid: report the field errors
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

urls.py:

from django.urls import path
from studentapp.views import StudentView

urlpatterns = [
    path('student/', StudentView.as_view()),
]

As the Student model above shows, we have three fields: a character field, an integer field and a JSONField. The three requests below show how the two points mentioned at the beginning come into play.

When entering an empty value into our JSONField “detailed_data”
When entering a key-value pair into our JSONField
When entering a list of dictionaries
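
To make the three cases concrete, here is a small sketch of what such request bodies could look like. The student name and the contents of `detailed_data` are made-up illustrations, not the values from the original requests.

```python
import json

# Three request bodies that differ only in "detailed_data".
# All names and values here are hypothetical.
payloads = [
    {"student_name": "Asha", "roll_no": 1, "detailed_data": {}},
    {"student_name": "Asha", "roll_no": 1, "detailed_data": {"hobby": "chess"}},
    {"student_name": "Asha", "roll_no": 1,
     "detailed_data": [{"a": 1}, {"b": "two"}]},
]

for body in payloads:
    # Each body is well-formed JSON; nothing constrains the shape of "detailed_data".
    print(json.dumps(body))
```

All three bodies are equally acceptable as JSON, which is why a plain JSONField cannot tell them apart.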

So far there is no defined structure that our JSONField “detailed_data” has to follow. This is where the keyword “schema” comes in. A schema is the outline, or structure, that the JSONField must follow. If the values we enter into “detailed_data” in Insomnia abide by the schema, the validation of “detailed_data” succeeds. If, however, the value of a field within “detailed_data” is of a type different from the one defined in the schema, or if any field within “detailed_data” is missing, we get validation errors. Let’s understand this by diving into the “detailed_data” field of our Student model.
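
As a first taste of the idea, here is a minimal sketch using the `jsonschema` package. The one-field schema and the sample values are assumptions chosen only to keep the example small.

```python
from jsonschema import Draft7Validator

# A deliberately tiny schema: an object with one required string field.
tiny_schema = {
    "type": "object",
    "properties": {"mother_name": {"type": "string"}},
    "required": ["mother_name"],
}

validator = Draft7Validator(tiny_schema)

conforming = validator.is_valid({"mother_name": "Fatima"})  # matches the schema
wrong_type = validator.is_valid({"mother_name": 42})        # integer instead of string
missing = validator.is_valid({})                            # required field absent

print(conforming, wrong_type, missing)
```

Only the first value follows the structure, so only the first check passes.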

Let’s follow the below layout for the fields within our “detailed_data” JSONField:

  1. Date of Joining: Date format
  2. Mother’s name: Character type
  3. Father’s name: Character type
  4. Teacher Information: A dictionary comprising Name, Date of Birth, Department, Date of Joining
  5. Semester 1 subjects: A list of dictionaries, with each dictionary comprising subject name, professor, score
  6. Semester 2 subjects: Another list of dictionaries with the same structure and subfields as mentioned in 5.

Given the above, our objective is to first check that the serialization (concerning “student_name” and “roll_no”) is valid using the basic serializer check, and then to check the validity of the contents of “detailed_data” by comparing them with the schema. You can either write the schema manually or use jsonmodels. The latter is recommended since it auto-generates the schema, but to get an idea of what the schema looks like, let’s go through both methods:

  1. Manually writing the schema

The schema for the previously mentioned layout would be as follows:

{
    "type": "object",
    "properties": {
        "date_of_joining": {"type": "string", "pattern": r'^[0-9]{2}-[0-9]{2}-[0-9]{4}$'},
        "mother_name": {"type": "string"},
        "father_name": {"type": "string"},
        "teacher_information": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "date_of_birth": {"type": "string", "pattern": r'^[0-9]{2}-[0-9]{2}-[0-9]{4}$'},
                "department": {"type": "string"},
                "joining_date": {"type": "string", "pattern": r'^[0-9]{2}-[0-9]{2}-[0-9]{4}$'}
            },
            "required": ["name", "date_of_birth", "department", "joining_date"],  # (1)
            "additionalProperties": False
        },
        "semester_1_subjects": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "subject_name": {"type": "string"},
                    "professor": {"type": "string"},
                    "score": {"type": "integer"}
                },
                "required": ["subject_name", "professor", "score"],  # (2)
                "additionalProperties": False
            }
        },
        "semester_2_subjects": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "subject_name": {"type": "string"},
                    "professor": {"type": "string"},
                    "score": {"type": "integer"}
                },
                "required": ["subject_name", "professor", "score"],  # (3)
                "additionalProperties": False
            }
        }
    },
    "required": ["date_of_joining", "mother_name", "father_name", "teacher_information", "semester_1_subjects", "semester_2_subjects"],  # (4)
    "additionalProperties": False
}

The list given as the value of the “required” key in (1) means that those fields (within “teacher_information”) MUST be included; if even one of them is missing, a validation error is generated. The same applies to (2) and (3) for every item of “semester_1_subjects” and “semester_2_subjects”. (4), in contrast, deals with the outer layer: it requires that “teacher_information”, “semester_1_subjects”, “semester_2_subjects” and the other top-level fields themselves be present. Setting “additionalProperties” to False at each level additionally rejects any field that is not listed under “properties”.
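
The difference between the inner and outer “required” lists can be seen on a trimmed-down schema. This is only a sketch: the schema keeps a single inner field, and the helper name `violations` is hypothetical.

```python
from jsonschema import Draft7Validator

# A cut-down version of the schema: one outer field holding one inner object.
schema = {
    "type": "object",
    "properties": {
        "teacher_information": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],          # inner rule, like (1)
            "additionalProperties": False,
        }
    },
    "required": ["teacher_information"],   # outer rule, like (4)
    "additionalProperties": False,
}

def violations(instance):
    """Return (location, keyword) pairs for every rule the instance breaks."""
    found = Draft7Validator(schema).iter_errors(instance)
    return sorted(("/".join(map(str, e.path)) or "<root>", e.validator) for e in found)

outer_missing = violations({})
inner_missing = violations({"teacher_information": {}})
unexpected = violations({"teacher_information": {"name": "R"}, "extra": 1})

print(outer_missing)
print(inner_missing)
print(unexpected)
```

The location shows which layer complained: the root object for the outer rule, and `teacher_information` for the inner one.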

Now let’s include our schema in views.py and use Draft7Validator for carrying out comparison of what we are entering against the defined schema.

from jsonschema import Draft7Validator

class StudentView(APIView):
    def post(self, request):
        data = request.data
        serializer = StudentSerializer(data=request.data)
        if serializer.is_valid():
            myschema = ...  # insert the schema defined above here
            v = Draft7Validator(myschema)
            # Collect the schema violations once instead of validating twice
            errors = list(v.iter_errors(data["detailed_data"]))
            if errors:
                # A schema violation is a client error, so answer with 400
                return Response({"error": str(errors)}, status=status.HTTP_400_BAD_REQUEST)
            else:
                serializer.save()
                return Response(serializer.data, status=status.HTTP_201_CREATED)
        else:
            return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

An example of correct type of data POST-ing

In the request above, the values of all the fields are exactly of the types written in the schema and no fields are missing, so the entry was saved without any validation errors. Let’s make a few changes to see how validation errors are reported. Let’s leave out the “professor” key-value pair from the first item of “semester_2_subjects”, and “subject_name” from the second item of “semester_1_subjects”. Let’s also write an integer value for “department” within “teacher_information” and use a wrong pattern for “date_of_joining”.

Display of ValidationError messages
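
For readers without the screenshot, the following sketch reproduces the kind of report the validator gives. It uses a shortened schema and made-up values, and the helper `describe` is a hypothetical convenience, not part of the view above.

```python
from jsonschema import Draft7Validator

DATE = r"^[0-9]{2}-[0-9]{2}-[0-9]{4}$"
subject = {
    "type": "object",
    "properties": {
        "subject_name": {"type": "string"},
        "professor": {"type": "string"},
        "score": {"type": "integer"},
    },
    "required": ["subject_name", "professor", "score"],
    "additionalProperties": False,
}
# Shortened schema: only the fields needed to trigger the three error kinds.
schema = {
    "type": "object",
    "properties": {
        "date_of_joining": {"type": "string", "pattern": DATE},
        "teacher_information": {
            "type": "object",
            "properties": {"department": {"type": "string"}},
            "required": ["department"],
        },
        "semester_2_subjects": {"type": "array", "items": subject},
    },
    "required": ["date_of_joining", "teacher_information", "semester_2_subjects"],
}

bad = {
    "date_of_joining": "2021/12/02",                                   # wrong pattern
    "teacher_information": {"department": 7},                          # integer, not string
    "semester_2_subjects": [{"subject_name": "Maths", "score": 90}],   # no professor
}

def describe(instance):
    """Map each failing location (as a path string) to its error messages."""
    report = {}
    for err in Draft7Validator(schema).iter_errors(instance):
        where = "/".join(str(part) for part in err.path) or "<root>"
        report.setdefault(where, []).append(err.message)
    return report

for where, messages in describe(bad).items():
    print(where, "->", messages)
```

Grouping messages by location like this can be easier for an API client to read than the stringified list of error objects.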

With this we have covered the various types of validation errors.
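
One case the date pattern does not cover is a string that has the right shape but is not a real date. The sketch below contrasts the pattern with a stricter standard-library check; reading the pattern as day-month-year is an assumption, since the schema only fixes the digit counts.

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"^[0-9]{2}-[0-9]{2}-[0-9]{4}$")

def matches_pattern(text):
    """True when the text has the two-two-four digit shape used in the schema."""
    return DATE_PATTERN.match(text) is not None

def is_real_date(text):
    """True only when the text is also a date that exists (assumed DD-MM-YYYY)."""
    try:
        datetime.strptime(text, "%d-%m-%Y")
    except ValueError:
        return False
    return True

for sample in ["02-12-2021", "99-99-2021", "2021-12-02"]:
    print(sample, matches_pattern(sample), is_real_date(sample))
```

If impossible dates matter for your data, a check along these lines could run after the schema validation.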

This method involves writing out the schema manually, but what if you didn’t wish to write the schema yourself and wanted something to do it for you? This is where jsonmodels comes into play.

  2. Using JSON Models

In order to be able to generate the previous schema, we need to make the following additions in models.py

from jsonmodels import models as m1, fields

class teacherField(m1.Base):
    name = fields.StringField(required=True)
    date_of_birth = fields.DateField(required=True)
    department = fields.StringField(required=True)
    joining_date = fields.DateField(required=True)

class sem1Field(m1.Base):
    subject_name = fields.StringField(required=True)
    professor = fields.StringField(required=True)
    score = fields.IntField(required=True)

class sem2Field(m1.Base):
    subject_name = fields.StringField(required=True)
    professor = fields.StringField(required=True)
    score = fields.IntField(required=True)

class detailed_data_field(m1.Base):
    date_of_joining = fields.DateField(required=True)
    mother_name = fields.StringField(required=True)
    father_name = fields.StringField(required=True)
    teacher_information = fields.EmbeddedField(teacherField)
    semester_1_subjects = fields.EmbeddedField(sem1Field)
    semester_2_subjects = fields.EmbeddedField(sem2Field)

Our new views.py:

from jsonschema import Draft7Validator
from studentapp.models import detailed_data_field

class StudentView(APIView):
    def post(self, request):
        data = request.data
        serializer = StudentSerializer(data=request.data)
        if serializer.is_valid():
            data1 = detailed_data_field()
            # Let jsonmodels generate the schema from the classes in models.py
            myschema = data1.to_json_schema()
            print(myschema)
            v = Draft7Validator(myschema)
            errors = list(v.iter_errors(data["detailed_data"]))
            if errors:
                return Response({"error": str(errors)}, status=status.HTTP_400_BAD_REQUEST)
            else:
                serializer.save()
                return Response(serializer.data, status=status.HTTP_201_CREATED)
        else:
            return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

The generated schema:

{
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "date_of_joining": {"type": "string"},
        "father_name": {"type": "string"},
        "mother_name": {"type": "string"},
        "semester_1_subjects": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "professor": {"type": "string"},
                "score": {"type": "number"},
                "subject_name": {"type": "string"},
            },
            "required": ["professor", "score", "subject_name"],
        },
        "semester_2_subjects": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "professor": {"type": "string"},
                "score": {"type": "number"},
                "subject_name": {"type": "string"},
            },
            "required": ["professor", "score", "subject_name"],
        },
        "teacher_information": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "date_of_birth": {"type": "string"},
                "department": {"type": "string"},
                "joining_date": {"type": "string"},
                "name": {"type": "string"},
            },
            "required": ["date_of_birth", "department", "joining_date", "name"],
        },
    },
    "required": ["date_of_joining", "father_name", "mother_name"],
}

As far as I’ve read about the different field types jsonmodels has to offer, there isn’t one that makes “type”: “array” be generated for the assigned field in the schema. For example, our “semester_1_subjects” and “semester_2_subjects” fields have “type”: “object” in the schema, when we need “type”: “array”. Let’s compare the “semester_1_subjects” section of the schema from the first method with the one in the schema above:

# FROM NEW SCHEMA
"semester_1_subjects": {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "professor": {"type": "string"},
        "score": {"type": "number"},
        "subject_name": {"type": "string"},
    },
    "required": ["professor", "score", "subject_name"],
}

# FROM OLDER SCHEMA
"semester_1_subjects": {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "subject_name": {"type": "string"},
            "professor": {"type": "string"},
            "score": {"type": "integer"}
        },
        "required": ["subject_name", "professor", "score"],
        "additionalProperties": False
    }
}

We need to make the new schema for “semester_1_subjects” look like the older one, and the same goes for “semester_2_subjects”. To do this, we change our view so that it slightly manipulates the auto-generated schema. In the new schema we can also see that “semester_1_subjects”, “semester_2_subjects” and “teacher_information” are not included in the outer “required” list (those fields were declared without required=True), and that the date fields have no constraints such as a format; they just have “type”: “string”. Let’s rectify all of this by manipulating the schema that jsonmodels hands over to us.
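
The reshaping itself is plain dictionary work and can be sketched in isolation. The cut-down `generated` dictionary below stands in for the real jsonmodels output, and the helper name `as_array_of` is hypothetical.

```python
import copy

def as_array_of(object_schema):
    """Return a new schema describing a list whose items follow object_schema."""
    return {"type": "array", "items": copy.deepcopy(object_schema)}

# Stand-in for the auto-generated schema, reduced to one field.
generated = {
    "type": "object",
    "properties": {
        "semester_1_subjects": {
            "type": "object",
            "properties": {"score": {"type": "number"}},
            "required": ["score"],
        }
    },
    "required": [],
}

# Work on a copy so the generated schema itself is left untouched.
patched = copy.deepcopy(generated)
patched["properties"]["semester_1_subjects"] = as_array_of(
    generated["properties"]["semester_1_subjects"]
)
patched["required"].append("semester_1_subjects")

print(patched["properties"]["semester_1_subjects"]["type"])
```

Copying before patching is a design choice: it avoids surprises if the same schema dictionary is reused elsewhere.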

Our updated views.py:

class StudentView(APIView):
    def post(self, request):
        data = request.data
        serializer = StudentSerializer(data=request.data)
        if serializer.is_valid():
            data1 = detailed_data_field()
            myschema = data1.to_json_schema()
            myfields = ['semester_1_subjects', 'semester_2_subjects']
            for i in myfields:
                # Wrap the generated object schema so it describes a list of such objects
                myschema["properties"][i] = {"type": "array", "items": myschema["properties"][i]}
            myschema["required"].extend(["teacher_information", "semester_1_subjects", "semester_2_subjects"])
            # Constrain the date strings; the dot before the microseconds is escaped
            date_pattern = r'^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{6}$'
            myschema["properties"]["date_of_joining"]["pattern"] = date_pattern
            # The nested field lives under "properties" of teacher_information
            myschema["properties"]["teacher_information"]["properties"]["date_of_birth"]["pattern"] = date_pattern
            v = Draft7Validator(myschema)
            errors = list(v.iter_errors(data["detailed_data"]))
            if errors:
                return Response({"error": str(errors)}, status=status.HTTP_400_BAD_REQUEST)
            else:
                serializer.save()
                return Response(serializer.data, status=status.HTTP_201_CREATED)
        else:
            return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)

Now let’s repeat the same invalid values and missing fields as before to check that the validation is effective.

It’s working perfectly! There may be a field within jsonmodels that combines an embedded field with a feature that produces “type”: “array”, but from what I’ve read so far, there isn’t one. If there is something that achieves the same, do let it be known in the comment section!
