Django REST efficient bulk create

Amir Ayat
3 min read · May 5, 2023

This story discusses how to handle POST requests with efficient database queries in Django REST framework (DRF).

Please check the source code on GitHub.

Database query efficiency is a critical aspect of database performance. A database query is a request for information from a database, and the efficiency of the query determines how quickly and accurately the requested data is returned. An efficient query retrieves the required data in the shortest possible time, with the least amount of system resources.

Several factors can impact the efficiency of a database query. One of the most significant is the complexity of the query itself. The more complex the query, the longer it may take to process and return results. This is especially true if the query involves joining multiple tables, sorting, or grouping data.

Actions speak louder than words. Let’s investigate a project for a better understanding.

Consider a reservation system that allows users to book a room for a date.

models.py

from django.db import models


class Room(models.Model):
    """
    model class for rooms
    """
    ROOM_TYPE = (
        ("Single", "Single"),
        ("Double", "Double"),
        ("Triple", "Triple"),
        ("Quad", "Quad"),
        ("VIP", "VIP")
    )
    type = models.CharField(choices=ROOM_TYPE, max_length=6)
    price = models.PositiveIntegerField()

    class Meta:
        db_table = 'room'


class Reservation(models.Model):
    """
    model class for users reservations
    """
    date = models.DateField()
    room = models.ForeignKey(Room, on_delete=models.CASCADE)
    reservationist = models.CharField(max_length=50)
    phone = models.CharField(max_length=13)

    class Meta:
        db_table = 'reservation'
        # A room can be booked only once per date.
        constraints = [
            models.UniqueConstraint(
                fields=['date', 'room'], name='unique_date_room')
        ]
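
To make the unique_date_room constraint concrete, here is a minimal sketch outside Django that uses Python's built-in sqlite3 module as a stand-in for the real database. The simplified table, the book() helper, and the sample dates are made up for illustration; only the constraint name and its two columns come from the model above.

```python
import sqlite3

# In-memory stand-in for the real database (simplified reservation table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservation ("
    " id INTEGER PRIMARY KEY, date TEXT NOT NULL, room_id INTEGER NOT NULL,"
    " CONSTRAINT unique_date_room UNIQUE (date, room_id))"
)


def book(date, room_id):
    """Hypothetical helper: return True if the booking was stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO reservation (date, room_id) VALUES (?, ?)",
                (date, room_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected a second booking of the same room and date.
        return False


first = book("2023-05-05", 1)   # room 1 is free on that date
second = book("2023-05-05", 1)  # same room, same date
third = book("2023-05-06", 1)   # same room, another date
print(first, second, third)
```

The point is that the database itself refuses a double booking, whatever the application code does.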

What if a user wants to book several rooms in one request? A quick search will turn up a snippet like this:

views.py

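from rest_framework import status
from rest_framework.response import Response


# Method of the reservation view class (the class itself is not shown here).
def create(self, request, *args, **kwargs):
    # A JSON list in the request body means several reservations at once.
    many = isinstance(request.data, list)
    serializer = self.get_serializer(data=request.data, many=many)
    serializer.is_valid(raise_exception=True)
    # reservation_checker is not shown in this article.
    reservation_checker(serializer.initial_data, many)
    self.perform_create(serializer)
    headers = self.get_success_headers(serializer.data)
    return Response({"status": "Done."},
                    status=status.HTTP_201_CREATED,
                    headers=headers)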

Counting the queries shows that this code sends one INSERT per entry. That is expected, because DRF serializers are designed to create one object at a time, and with many=True the list serializer simply calls create() once for each item. One solution is to override perform_create() as follows:

views.py

def perform_create(self, serializer):
    validated_data = serializer.validated_data
    # A single reservation arrives as a dict; normalise it to a list.
    if not isinstance(validated_data, list):
        validated_data = [validated_data]
    # Build unsaved instances. data["room"] is an integer id only when the
    # serializer declares room as an IntegerField (see serializers.py below);
    # a bare ModelSerializer would hand back a Room instance here.
    data_set = [
        Reservation(
            room_id=data["room"],
            reservationist=data["reservationist"],
            phone=data["phone"],
            date=data["date"]
        )
        for data in validated_data
    ]
    # Insert the whole batch at once instead of saving row by row.
    reservations = Reservation.objects.bulk_create(data_set)
    return reservations
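
To see what this buys us, here is a minimal sketch outside Django, again with Python's built-in sqlite3 module standing in for the real database. The run() helper and the sample rows are made up for illustration. It compares one INSERT per reservation with a single multi-row INSERT, which is roughly the kind of statement bulk_create() sends on backends that support it.

```python
import sqlite3

# In-memory stand-in for the real database; autocommit keeps the count simple.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE reservation ("
    " id INTEGER PRIMARY KEY, date TEXT, room_id INTEGER,"
    " reservationist TEXT, phone TEXT, UNIQUE (date, room_id))"
)

statements = []  # every statement sent through run() is recorded here


def run(sql, params=()):
    """Hypothetical helper: record the statement, then execute it."""
    statements.append(sql)
    return conn.execute(sql, params)


# Sample payload (made up): one guest booking three rooms for the same date.
rows = [("2023-05-05", room_id, "Guest", "0000000000") for room_id in (1, 2, 3)]

# Naive approach: one INSERT per reservation.
for row in rows:
    run("INSERT INTO reservation (date, room_id, reservationist, phone)"
        " VALUES (?, ?, ?, ?)", row)
per_row_statements = len(statements)

# Bulk approach: a single INSERT that carries every row.
conn.execute("DELETE FROM reservation")
statements.clear()
placeholders = ", ".join(["(?, ?, ?, ?)"] * len(rows))
flat_params = [value for row in rows for value in row]
run("INSERT INTO reservation (date, room_id, reservationist, phone)"
    f" VALUES {placeholders}", flat_params)
bulk_statements = len(statements)

stored = conn.execute("SELECT COUNT(*) FROM reservation").fetchone()[0]
print(per_row_statements, bulk_statements, stored)
```

Both approaches store the same rows; only the number of round trips to the database differs.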

The situation has improved significantly, but we still see a group of similar, repeated queries.

A quick look at the query log shows that, for every item, the serializer sends a query to check that the related object exists, which here is the room. So let’s replace the bare ModelSerializer with this one:

serializers.py

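from django.utils.functional import cached_property
from rest_framework import serializers

from .models import Reservation, Room  # assumes models.py is in the same app


class MakeReservationSerializer(serializers.ModelSerializer):
    """
    serializer class to make reservations
    """
    # Plain integer instead of the default related field, so DRF does not
    # look up the room once per item.
    room = serializers.IntegerField()

    @cached_property
    def all_rooms(self):
        # One query for all room ids, evaluated once per serializer instance.
        return set(Room.objects.values_list("id", flat=True))

    def validate(self, attrs):
        _room = attrs.get("room")
        if _room not in self.all_rooms:
            raise serializers.ValidationError(
                f"Invalid room {_room} - object does not exist.")
        return super().validate(attrs)

    class Meta:
        model = Reservation
        fields = [
            "date",
            "room",
            "reservationist",
            "phone"
        ]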
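
The same idea in isolation, sketched with sqlite3 rather than DRF: checking each requested room with its own query versus loading all room ids once and checking membership in memory. The fetch() helper, the sample rooms, and the requested ids are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the room table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE room (id INTEGER PRIMARY KEY, type TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO room (id, type, price) VALUES (?, ?, ?)",
    [(1, "Single", 100), (2, "Double", 150), (3, "VIP", 400)],
)

query_count = 0


def fetch(sql, params=()):
    """Hypothetical helper: count the query, then return all its rows."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Room ids taken from an imaginary bulk request; room 7 does not exist.
requested_room_ids = [1, 2, 3, 2, 7]

# Per-item validation: one existence query for every requested room.
query_count = 0
missing_per_item = [
    room_id for room_id in requested_room_ids
    if not fetch("SELECT 1 FROM room WHERE id = ?", (room_id,))
]
per_item_queries = query_count

# Preloaded validation: one query for all ids, then set membership checks.
query_count = 0
known_ids = {row[0] for row in fetch("SELECT id FROM room")}
missing_preloaded = [
    room_id for room_id in requested_room_ids if room_id not in known_ids
]
preloaded_queries = query_count

print(per_item_queries, preloaded_queries, missing_per_item, missing_preloaded)
```

The preloaded version always costs one query, but that query grows with the size of the room table rather than with the size of the request.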

That is much better. Note, however, that we have traded many tiny queries for a single large one that loads every room id! These screenshots show the difference.

[Screenshot: multiple duplicate queries profile (red)]
[Screenshot: multiple duplicate queries (red)]
[Screenshot: single all query profile (red)]
[Screenshot: single all query (red)]

As a last step, we should cache the result of that large query so that it is not re-run on every request.
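
One way to picture that step is a small time-based cache around the room-id lookup. The sketch below is plain Python and not the project's implementation: load_room_ids(), get_room_ids(), invalidate_room_ids(), and the 60-second lifetime are all hypothetical, and in a real Django project the cache framework (django.core.cache) would likely be the more natural home for this.

```python
import time

_cache = {}  # key -> (expiry timestamp, cached value)
load_calls = 0  # how many times the "expensive" lookup actually ran


def load_room_ids():
    """Hypothetical stand-in for the large 'all room ids' query."""
    global load_calls
    load_calls += 1
    return {1, 2, 3}


def get_room_ids(ttl_seconds=60.0, now=time.monotonic):
    """Return cached room ids, reloading them once the entry has expired."""
    entry = _cache.get("room_ids")
    if entry is not None and entry[0] > now():
        return entry[1]
    value = load_room_ids()
    _cache["room_ids"] = (now() + ttl_seconds, value)
    return value


def invalidate_room_ids():
    """Drop the cached ids, e.g. after a room is added or removed."""
    _cache.pop("room_ids", None)


first = get_room_ids()   # runs the lookup
second = get_room_ids()  # served from the cache
invalidate_room_ids()
third = get_room_ids()   # runs the lookup again
print(load_calls)
```

Whatever cache is used, it has to be invalidated or allowed to expire when rooms change; otherwise a newly added room would be rejected as nonexistent.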
