Basic Performance Optimization in Django

Ryley Sill
10 min read · Jan 7, 2019


TL;DR: Profile your code, make fewer SQL queries, serialize less data

I recently built Carta’s Scenario Modeling platform. A little ways into the project I realized I didn’t know how to write performant code. I still don’t, but here are some things I wish I knew before I got started.

Disclaimer

The sections below are high-level explanations meant to expose you to different techniques you can use to optimize your Django code. Results will vary depending on your database, models, and the nature of your data.

Getting Started — example data model

class Library(models.Model):
    name = models.CharField(max_length=200, default='')
    address = models.CharField(max_length=200, default='')


class Author(models.Model):
    name = models.CharField(max_length=200, default='')


class Book(models.Model):
    library = models.ForeignKey(
        Library,
        on_delete=models.CASCADE,
        related_name='books',
    )
    author = models.ForeignKey(
        Author,
        on_delete=models.CASCADE,
        related_name='books',
    )
    title = models.CharField(max_length=200, default='')
    address = models.CharField(max_length=200, default='')

    def get_page_count(self):
        return self.pages.count()


class Page(models.Model):
    book = models.ForeignKey(
        Book,
        on_delete=models.CASCADE,
        related_name='pages',
    )
    text = models.TextField(null=True, blank=True)
    page_number = models.IntegerField()

I created a simple data model consisting of books, pages, authors, and libraries so we can play with the techniques outlined below. The database is seeded with 10,000 books, 1,000,000 pages, 1,000 authors, and 1,000 libraries. The pages are randomly assigned to books and the books are randomly assigned to authors and libraries.

Use line_profiler

If we don’t know why our code is slow, it’s going to be difficult to figure out how to optimize it. line_profiler is a cool Python module that tells us how much time it takes to execute each line in a function. Before you get started, install the package with pip install line_profiler.

Here’s how I typically use it. Imagine we needed a function which returned all of the books corresponding to each library in our database. Let’s say we ended up with something like this…

def get_books_by_library_id():
    libraries = Library.objects.all()
    result = {}

    for library in libraries:
        books_in_library = library.books.all()
        result[library.id] = list(books_in_library)
    return result

…which was causing the below view to run slowly.

from django.http import HttpResponse
from test_app.helpers import get_books_by_library_id

def books_by_library_id_view(request):
    books_by_library_id = get_books_by_library_id()
    ...
    return HttpResponse(response)

To find out where the computer is spending its time during the execution of the function you can add an IPython debugger to the view right before where the suspect function is called…

from django.http import HttpResponse
from test_app.helpers import get_books_by_library_id
def books_by_library_id_view(request):
    from IPython import embed; embed()
    books_by_library_id = get_books_by_library_id()
    ...
    return HttpResponse(response)

…perform whatever action triggers the view, wait for your server to pause at the breakpoint and the IPython shell to start up, load line_profiler as an extension in the IPython shell, and then profile the function with lprun -f your_function_name your_function_name(). If the function takes any arguments, pass them inside the parentheses at the end of that command.
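In our case, the shell session would look roughly like this (a sketch; line_profiler ships an IPython extension that provides the lprun magic):

In [1]: %load_ext line_profiler
In [2]: %lprun -f get_books_by_library_id get_books_by_library_id()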

In the line_profiler output, the % Time column tells you how much of the Total time the function spent executing that line of code. Out of the 47.977 seconds it took to execute the entire function, we spent 98.8% of the time on line 6, where we evaluate the library.books.all() queryset by coercing it to a list.

The Hits column tells you how many times that line of code was executed. Line 6 was executed 1,000 times, once for each Library in the database. That means we’re making 1,000 SQL queries, which is probably why this function is so slow. We’ll cover how to make this faster in later sections.

Turn on SQL logging

When we’re really digging into a function to see where the bottlenecks are it can be useful to turn on SQL logging.

# settings.py
LOGGING = {
    'version': 1,
    'filters': {
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        }
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'filters': ['require_debug_true'],
            'class': 'logging.StreamHandler',
        }
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        }
    }
}

We can add this snippet to settings.py to get a full printout of every SQL query our code executes. It’s usually way too much to keep on at all times, but it can be enlightening to see how many queries a given block of code executes and how long each of them takes.

The way I typically use SQL logging is to put a debugger before and after a suspect block of code. I then execute the code, clear my terminal once I hit the first debugger, continue through the first debugger, and investigate the queries that were executed in between while keeping an eye out for repeated, unexpected, or slow queries.
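Something like this (a sketch; the embed() calls are just breakpoints wrapped around the code under suspicion):

from IPython import embed

def get_books_by_library_id():
    libraries = Library.objects.all()
    result = {}

    embed()  # first breakpoint: clear the terminal here, then exit the shell
    for library in libraries:
        books_in_library = library.books.all()
        result[library.id] = list(books_in_library)
    embed()  # second breakpoint: every query logged above came from the loop
    return result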

This post won’t cover SQL query optimization, but we will go over how to reduce or eliminate repeated queries in a function.

Avoid queries in loops

def get_books_by_library_id():
    libraries = Library.objects.all()
    result = {}

    for library in libraries:
        books_in_library = library.books.all()
        result[library.id] = list(books_in_library)
    return result

Back to our terminally ill function get_books_by_library_id. The problem is that we’re making a round trip to the database to fetch the books for each library. This might be what you want if you only have a few libraries, each with millions of books. But if you have thousands of libraries, each with thousands of books, your code is now executing thousands of queries. To get around this we can grab all of the books up front:

from collections import defaultdict

def get_books_by_library_id_one_query():
    books = Book.objects.all()
    result = defaultdict(list)

    for book in books:
        result[book.library_id].append(book)
    return result

Boom. Now you’re only making one SQL query regardless of how many libraries exist in your database.

So how do we know if this is any faster? Python has a handy module called timeit that tells you how long it takes to execute a function a given number of times.

In [12]: timeit(get_books_by_library_id, number=10)
Out[12]: 6.598360636999132
In [13]: timeit(get_books_by_library_id_one_query, number=10)
Out[13]: 0.677092163998168

Even though our second function has to loop over 10,000 books, it still runs nearly 10x faster than the original function by eliminating round trips to the database. Keep in mind this all depends on the sparseness/density and scale of your data. In my experience, code generally runs slower if the number of SQL queries made scales with some other input.

select_related()

Imagine you wanted a list of strings in the format of Harry Potter and the Sorcerer's Stone by J.K. Rowling corresponding to each library.

Django makes it easy to write code that executes unnecessary queries. If you don’t know what’s happening under the hood or you’re working with unfamiliar models, you might write something like this:

def get_books_by_author():
    books = Book.objects.all()
    result = defaultdict(list)
    for book in books:
        author = book.author
        title_and_author = '{} by {}'.format(
            book.title,
            author.name
        )
        result[book.library_id].append(title_and_author)
    return result

Here’s the problem: each time you access book.author you’re making a query equivalent to Author.objects.get(id=book.author_id). If you’re looping over thousands of books you’re also making thousands of completely unnecessary queries. Use select_related to avoid this:

def get_books_by_author_select_related():
    books = Book.objects.all().select_related('author')
    result = defaultdict(list)
    for book in books:
        author = book.author
        title_and_author = '{} by {}'.format(
            book.title,
            author.name
        )
        result[book.library_id].append(title_and_author)
    return result

select_related works by making a single, more complex SQL query (a JOIN) that also returns the fields of the related object. So instead of just getting back data about all of the books, you’re also grabbing data for each book’s author. Now when you access book.author you’re actually accessing a cached version of the author rather than making a separate database query.
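If you’re curious what that query looks like, you can print it from a shell (a quick sketch; the exact SQL depends on your database backend):

# select_related('author') folds the author lookup into a JOIN on the same
# query rather than issuing a separate query per book.
print(Book.objects.select_related('author').query)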

How much faster is the naive approach vs the approach using select_related? I ran both of these functions using the timeit module and found that the function using select_related was 32x faster:

In [12]: timeit(get_books_by_author, number=10)
Out[12]: 41.363460485998075
In [13]: timeit(get_books_by_author_select_related, number=10)
Out[13]: 1.2787263889913447

select_related won’t work in all situations, however:

to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.

In order to traverse reverse ForeignKey or ManyToMany relationships we’ll need prefetch_related.

prefetch_related()

prefetch_related is similar to select_related in that it prevents unnecessary SQL queries. Instead of fetching the primary and related objects in one go, prefetch_related makes separate queries for each relationship and “joins” the results together in Python. The downside of this approach is that it requires multiple round trips to the database.

Author.objects.filter(name__startswith='R').prefetch_related('books')

How it works: first a request is fired off which runs the primary query Author.objects.filter(name__startswith='R'). Then a second query equivalent to Book.objects.filter(author_id__in=PKS_OF_AUTHORS_FROM_FIRST_REQUEST) is executed, and the two responses are merged together into an Author queryset that has each author’s books cached in memory. So you end up with a similar result as with select_related, but you arrive there through different means.
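After that, iterating over the queryset and touching each author’s books no longer hits the database (a minimal sketch):

authors = Author.objects.filter(name__startswith='R').prefetch_related('books')

for author in authors:
    # author.books.all() is served from the prefetch cache, so this loop
    # doesn't issue an extra query per author.
    titles = [book.title for book in author.books.all()]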

Although you can use prefetch_related anywhere you use select_related, generally your code will run faster with select_related since it will grab everything you need in one SQL query. But if your data is particularly sparse (millions of books to just a few libraries) you may see a performance boost despite the extra trips to the database. So when in doubt, try both methods and see which one comes out on top.

To summarize: use select_related and prefetch_related if you’re going to access a model’s relationship after making an initial query. Use prefetch_related if you’re following a reverse ForeignKey or ManyToMany relationship. Use select_related if you’re following a ForeignKey or OneToOne relationship but keep in mind that prefetch_related could be a better choice for these types of relationships depending on the nature of your data.

values() and values_list()

The time it takes to serialize SQL responses into Python objects scales with both the number of rows and the number of columns returned. In the function below, we’re serializing all of the fields defined on the book and author even though we only need the author’s name and the book’s library_id and title. We’re also initializing a Django model instance for every row for no reason, since we’re not doing anything special with it (like calling model methods).

def get_books_by_author_select_related():
    books = Book.objects.all().select_related('author')
    result = defaultdict(list)
    for book in books:
        author = book.author
        title_and_author = '{} by {}'.format(
            book.title,
            author.name
        )
        result[book.library_id].append(title_and_author)
    return result

Because of this, we incur pretty significant overhead that can be eliminated by calling .values() or .values_list() on the queryset to ask only for the fields we need:

def get_books_by_author_select_related_values():
    books = (
        Book.objects
        .all()
        .select_related('author')
        .values('title', 'library_id', 'author__name')
    )
    result = defaultdict(list)
    for book in books.iterator():
        title_and_author = '{} by {}'.format(
            book['title'],
            book['author__name']
        )
        result[book['library_id']].append(title_and_author)

    return result


def get_books_by_author_select_related_values_list():
    books = (
        Book.objects
        .all()
        .select_related('author')
        .values_list('title', 'library_id', 'author__name')
    )
    result = defaultdict(list)
    for book in books.iterator():
        title_and_author = '{} by {}'.format(
            book[0],
            book[2]
        )
        result[book[1]].append(title_and_author)

    return result

.values() returns dictionary representations of each row, e.g. [{'title': 'Snow Crash', 'library_id': 9, 'author__name': 'Neil'}, ...], and .values_list() returns tuple representations, e.g. [('Snow Crash', 9, 'Neil'), ...].

So how much faster are these functions? By only grabbing the fields we need, the functions using .values() and .values_list() run ~7x faster than the original function.

In [13]: timeit(get_books_by_author_select_related, number=10)
Out[13]: 1.2787263889913447
In [14]: timeit(get_books_by_author_select_related_values, number=1)
Out[14]: 0.19064296898432076
In [15]: timeit(get_books_by_author_select_related_values_list, number=1)
Out[15]: 0.17425400999491103

Something to note here is that these models are super lightweight: there are only 4 fields on the Book model. If we were working with models that had tons of fields, the results would be more extreme.

Even if you need all of the fields on the object, .values() and .values_list() will still provide a substantial performance boost by skipping model initialization. You can get all of the fields back by not passing in any fields as arguments.

# returns a list of model instances
def get_book_instances():
    return list(
        Book.objects
        .all()
        .select_related('author')
    )


# returns a list of dictionaries representing model instances
def get_book_dictionaries():
    return list(
        Book.objects
        .all()
        .select_related('author')
        .values()
    )


# returns a list of dictionaries with only the title of each book
def get_book_dictionaries_title_only():
    return list(
        Book.objects
        .all()
        .select_related('author')
        .values('title')
    )

Getting the dictionary representation of the books is 6.5x faster and asking for a particular field on the book is 8.8x faster than requesting the model instances.

In [64]: timeit(get_book_instances, number=100)
Out[64]: 12.904168864974054
In [65]: timeit(get_book_dictionaries, number=100)
Out[65]: 2.049193776998436
In [66]: timeit(get_book_dictionaries_title_only, number=100)
Out[66]: 1.4734381759772077

Initializing Django model instances is expensive. If you’re only using the data on the model you’re likely better off working with their dictionary or tuple representations.

bulk_create()

This one is pretty simple: if you’re going to create more than one object at a time, use bulk_create instead of creating the objects in a loop. As the name suggests, bulk_create inserts a list of objects into the database using a single query, regardless of how many objects you’re inserting.

It’s possible to insert so many objects that the single query generated by bulk_create will be slower than multiple smaller queries. In that case, you can pass in batch_size=SIZE_OF_BATCH as an argument which will break the master query into smaller queries.
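A minimal sketch using the Page model from above (the page text and numbers are just placeholder values):

book = Book.objects.first()

# build the Page instances in memory without hitting the database...
pages = [
    Page(book=book, text='...', page_number=i)
    for i in range(1, 101)
]

# ...then insert them all at once; batch_size is optional and splits the
# insert into queries of 50 rows each
Page.objects.bulk_create(pages, batch_size=50)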

I’m not aware of any rules of thumb regarding the number of objects you need to insert before it makes sense to use batch_size or how to decide on the size of the batch. Usually I omit it until it becomes clear that there’s a bottleneck and then determine a suitable batch_size from there.

SQL is (generally) faster than Python

Let’s imagine you wanted a function that returns the total number of pages for each library. Using what we’ve learned above you might end up with something like this:

def get_page_count_by_library_id():
    result = defaultdict(int)
    books = Book.objects.all().prefetch_related('pages')
    for book in books:
        result[book.library_id] += book.get_page_count()
    return result

Even though this will only fire off two queries, we still have to pull all of the books into memory and loop over each one of them. This sort of thing can easily be turned into an annotation.

from django.db.models import Count

def get_page_count_by_library_id_using_annotation():
    result = {}
    libraries = (
        Library.objects
        .all()
        .annotate(page_count=Count('books__pages'))
        .values_list('id', 'page_count')
    )
    for library_id, page_count in libraries:
        result[library_id] = page_count
    return result

Now, instead of pulling a bunch of Django instances into memory and calling a model method on each of them, you’re only pulling the two values you actually care about for each Library: the id and page_count. Our new function runs about 115x faster than the original.

In [66]: timeit(get_page_count_by_library_id, number=10)
Out[66]: 158.0743614450039
In [67]: timeit(get_page_count_by_library_id_using_annotation, number=10)
Out[67]: 1.3725216790044215

This example is pretty simple but there’s a lot you can do with annotations. If you’re running into performance issues while doing math on a large queryset, consider writing an annotation to offload that work to the database.
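For example, the same idea gives us book counts per author in a single query (a sketch, not from the original post):

from django.db.models import Count

# one query: the database counts each author's books for us, so we never
# pull Book rows into Python at all
book_counts = dict(
    Author.objects
    .annotate(book_count=Count('books'))
    .values_list('id', 'book_count')
)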

Wrapping up

If you’re working on a Django-based project where there has been little or no thought given to performance, there should be quite a bit of low-hanging fruit to get you started. In the majority of cases, the techniques described above will provide a substantial performance boost without much of a trade-off in terms of readability.
