Prefetch Related and Select Related in Django

Nensi Trambadiya
CodeptiveSolutions
Published in
4 min readOct 31, 2019

When Django fetches an object, it does not fetch related objects of that object. It will make separate queries for all related objects on access time. This behavior is not good in all cases.

Let’s understand this using a practical example.

First, we add below logging to the settings.py file to see all SQL queries call behind the Django ORM.

LOGGING = {
'version': 1,
'disable_existing_loggers': False,
'handlers': {
'console': {
'level': 'DEBUG',
'class': 'logging.StreamHandler',
}
},
'loggers': {
'django.db.backends': {
'handlers': ['console'],
'level': 'DEBUG',
},
}
}

We will consider these models for all select related examples below, so we can always come back here and read it again.

from django.db import modelsclass Author(models.Model):
name = models.CharField(max_length=50)
class Book(models.Model):
name= models.CharField(max_length=50)
author = models.ForeignKey(Author,on_delete=models.CASCADE)

It’s pretty simple, right? The Book has a foreign key relationship with the Author table.

>>> book = Book.objects.get(id=1)
(0.000) SELECT "django_orms_book"."id", "django_orms_book"."name", "django_orms_book"."author_id" FROM "django_orms_book" WHERE "django_orms_book"."id" = 1; args=(1,)
>>> book.author.name
(0.000) SELECT "django_orms_author"."id", "django_orms_author"."name" FROM "django_orms_author" WHERE "django_orms_author"."id" = 1; args=(1,)
'Author1'
>>> book.author.name
'Author1'

As above, a separate query for the author is executed when we access the name of the author. This separate queries for the related objects decrease the performance of an application. Consider we have 1000 books and we have to create a list of books with an author name. For each time we access one foreign key that is not in the cache, another query will be made to retrieve the value. So, in the end, 1001 queries will execute to fetch the list of books.

We can reduce 1001 queries to 1 query using select_related. 1

Select Related

Let’s create a query that fetches all books with the name of an author in 1 query.

from django_orms.models import *
from django.db.models import F
def get_all_books(): books = Book.objects.select_related('author').annotate(
author_name=F('author__name')
).values('id', 'name', 'author_name')
print(books)

I have defined function get_all_books() to fetch all books with an author name.

Now I’m calling this function from shell to see the number of queries.

>>> get_all_books()
(0.001) SELECT "django_orms_book"."id", "django_orms_book"."name", "django_orms_author"."name" AS "author_name" FROM "django_orms_book" INNER JOIN "django_orms_author" ON ("django_orms_book
"."author_id" = "django_orms_author"."id") LIMIT 21; args=()
<QuerySet [{'id': 1, 'name': 'Book1', 'author_name': 'Author1'}, {'id': 2, 'name': 'Book2', 'author_name': 'Author2'}, {'id': 3, 'name': 'Book3', 'author_name': 'Author3'}, '...(remaining elements truncated)...']>

As you can see in the output, only one join query has called to fetch all books. It’s a big improvement for an application.

We can not use select_related formany to many relationships, do you remember? To improve our queries we need to use a new method called prefetch_related .

Now, Let’s update the models add new field publishers in the Book model and change the name to the Person model.

from django.db import modelsclass Person(models.Model):
name = models.CharField(max_length=50)
class Book(models.Model):
name= models.CharField(max_length=50)
author = models.ForeignKey(Person,on_delete=models.CASCADE)
publishers = models.ManyToManyField(Person, related_name='publishers')

Prefetch Related

We can use the prefetch_related method with many to many relationships to improve performance by reducing the number of queries.

def get_all_books():    books = Book.objects.prefetch_related('publishers')    for book in books:
print(book.publishers.all())

Let’s call this function and see what happened.

>>> get_all_books()
(0.001) SELECT "django_orms_book"."id", "django_orms_book"."name", "django_orms_book"."author_id" FROM "django_orms_book"; args=()
(0.000) SELECT ("django_orms_book_publishers"."book_id") AS "_prefetch_related_val_book_id", "django_orms_person"."id", "django_orms_person"."name" FROM "django_orms_person" INNER JOIN "dja
ngo_orms_book_publishers" ON ("django_orms_person"."id" = "django_orms_book_publishers"."person_id") WHERE "django_orms_book_publishers"."book_id" IN (1, 2); args=(1, 2)
<QuerySet [<Person: Person object (2)>, <Person: Person object (3)>]>
<QuerySet [<Person: Person object (1)>, <Person: Person object (3)>]>

As you can see 2 queries executed using prefetch_related.

Let’s change function a little bit and add values() instead of all().

def get_all_books():    books = Book.objects.prefetch_related('publishers')    for book in books:
print(book.publishers.values('id', 'name'))

Let’s see the output.

>>> get_all_books()
(0.000) SELECT "django_orms_book"."id", "django_orms_book"."name", "django_orms_book"."author_id" FROM "django_orms_book"; args=()
(0.000) SELECT ("django_orms_book_publishers"."book_id") AS "_prefetch_related_val_book_id", "django_orms_person"."id", "django_orms_person"."name" FROM "django_orms_person" INNER JOIN "dja
ngo_orms_book_publishers" ON ("django_orms_person"."id" = "django_orms_book_publishers"."person_id") WHERE "django_orms_book_publishers"."book_id" IN (1, 2); args=(1, 2)
(0.000) SELECT "django_orms_person"."id", "django_orms_person"."name" FROM "django_orms_person" INNER JOIN "django_orms_book_publishers" ON ("django_orms_person"."id" = "django_orms_book_pu
blishers"."person_id") WHERE "django_orms_book_publishers"."book_id" = 1 LIMIT 21; args=(1,)
<QuerySet [{'id': 2, 'name': 'Person2'}, {'id': 3, 'name': 'Person3'}]>
(0.000) SELECT "django_orms_person"."id", "django_orms_person"."name" FROM "django_orms_person" INNER JOIN "django_orms_book_publishers" ON ("django_orms_person"."id" = "django_orms_book_pu
blishers"."person_id") WHERE "django_orms_book_publishers"."book_id" = 2 LIMIT 21; args=(2,)
<QuerySet [{'id': 1, 'name': 'Person1'}, {'id': 3, 'name': 'Person3'}]>

The number of queries increased to 4. This is because of the values() method instead of all(). 1 query is executed to fetch publishers in each iteration.

By default, prefetch related joined all results, but here we used the values('id','name') and because of that Django does not join the right results for us.

from django.db.models import Prefetch
def get_all_books():
books = Book.objects.prefetch_related(
Prefetch(
'publishers',
queryset=Person.objects.only('id', 'name'),
to_attr='all_publishers'
)) for book in books:
print(book.all_publishers)

I have added Prefetch to set new attributes for us. We are fetching values from publishers field using Person.objects.only('id','name') as base queryset and telling Django to we want all prefetched values in the all_publishers attribute.

Let’s check the output.

>>> get_all_books()
(0.000) SELECT "django_orms_book"."id", "django_orms_book"."name", "django_orms_book"."author_id" FROM "django_orms_book"; args=()
(0.001) SELECT ("django_orms_book_publishers"."book_id") AS "_prefetch_related_val_book_id", "django_orms_person"."id", "django_orms_person"."name" FROM "django_orms_person" INNER JOIN "dja
ngo_orms_book_publishers" ON ("django_orms_person"."id" = "django_orms_book_publishers"."person_id") WHERE "django_orms_book_publishers"."book_id" IN (1, 2); args=(1, 2)
[<Person: Person object (2)>, <Person: Person object (3)>]
[<Person: Person object (1)>, <Person: Person object (3)>]

YESS 😃. The problem is solved using Prefetch.

We can also do the same thing without the to_attr attribute of the Prefetch. Below is a function without to_attr.

from django.db.models import Prefetch
def get_all_books():
books = Book.objects.prefetch_related(
Prefetch(
'publishers',
queryset=Person.objects.only('id', 'name'),
)) for book in books:
print(book.publishers.all())

This gives the same output as above.

I hope you learned a bit more about prefetch and select related.

Thanks for reading this article. If you like it, click on 👏 to rate it out of 50 and also share it with your friends. It means a lot to me.

Would you like to check out my other articles?

--

--

Nensi Trambadiya
CodeptiveSolutions

Senior Software Engineer | Python Developer | Django | Flask | DRF | AWS | PostgreSQL | ETL | Elasticsearch | Kafka | Docker | Git | Faust | Microservices