A study on Django queries to improve admin load times
Aggregators can be a problem, let’s understand why and how to handle this situation
One of these days I noticed that after adding annotated columns to a model’s Manager
the load time of its Django admin’s pages was slower than before.
To investigate what could be happening I used Django Debug Toolbar to take a look at the SQL being executed and got curious about this part of the result:
I am no pro at database management but I know that eachJOIN
requires a lookup in another table which impacts the query execution and those in the image are there exactly because of the Case
annotations I used (which normalize some data based on the model’s related fields values).
Also, we have a little message saying “2 similar queries. Duplicated 2 times.” which means that we end up using double the time for this operation because it’s repeated in some other function while the request is being processed.
From the Django Debug Toolbar panel, by expanding both “similar queries”, the source code trace shows that they are operations to count the page’s object lists hence why they are similar. One is for the quantity of items of each page at .../core/paginator.py:87
and the other is for the total quantity of items of all pages (result of the attribute show_full_result_count
of ModelAdmin
, which you could disable) at .../admin/views/main.py:213
.
The source code mentioned above shows that these operations are pretty entangled in Django admin’s structure and optimizing them directly will require inheriting classes, overriding functions, copying their original content and tweaking the block which handles the QuerySet
.
My first insight was that annotations don’t change the quantity of elements of the queryset so their operations aren’t needed here, and I could work on a way to remove them before the queries are evaluated.
After much fiddling with Queryset.query
I managed to “remove” annotations but they mess with the query structure in the first place which made full optimization not possible by this approach…
Looking for other possibilities based on my initial premise, I thought:
- Annotations don’t modify the number of results so instead of removing them, they might not even be needed on the first place
- What changes the number of results are the filters, so if it’s possible to copy them, it’s also possible to create a new
QuerySet
without annotations
So I figured that queryset.model.objects.all()
provides a QuerySet
with no annotations and the filters could be re-applied by copying the attributes queryset.query.where
and queryset.query.where_class
from the original to the new QuerySet
. Below is an implementation of this behavior:
After applying this optimization (example is linked at the bottom of this story) a comparison can be made showing the achieved results in page loading times and queries being executed:
No JOIN
, no problem. Without the annotations it takes only about 4% of the initial time and the SQL query is much simpler too, doing only the COUNT
operation that is desired at these functions.
Each query execution time went from ~140ms to ~7ms which is an improvement of around 190% on total page load time.
The next step is to turn all of this into a mixin which will let ModelAdmin
classes effortlessly inherit the optimization. For this, both ChangeList
and Paginator
base Django classes need to be extended and changed to include the copy_queryset_without_annotations
function explained above and these new classes will be part of the the final mixin class.
Please consider looking at the complete source code for this story that is linked below, which you can use in your projects and even contains a bonus improvement that uses Queryset.defer
to further decrease SQL overhead.
All the source code shown and explained here (mixins, functions, examples, samples, etc.) is available in the GitHub repository below along with other helpful projects I’ve developed and wrote about. Feel free to check it out, use them in your own work and help me improve them with your feedback.
🍋 🍋 🍋