My Django page or API is slow, How do I find out why?
Why is it slow?
We often hear how Python’s interpreter is slower than JIT’ed or compiled languages but for nearly all Django web sites it’s not Python that’s to blame, it’s “N+1”.
Formally N+1 is when your API/task does work proportional to N where N is the number of items in the API.
Or more simply it’s when you do SQL calls in a for loop.
This is because the network round trip to the DB dominates other factors. This is true for any central DB server whether it be MySQL, Postgres or even Redis. Even if the DB is lightning fast and has the table pages in memory, it will still be slow due to the time it takes for the request/response to travel over the network to the DB from the app servers.
The above is not specific to Django or Python, you see it in other languages and frameworks. But N+1 is particularly pernicious in Django because relationships will cause SQL calls while looking like normal OOP code.
For example accessing obj.other_object in a for loop will result in SELECT * FROM <other_object’s table> per iteration of the loop.
Those familiar with Django will know that select_related or prefetch_related can help a lot here and so can judicious use of values/values_list rather than model objects.
While these Django optimisations can help it’s not easy to go from “My page/API is slow” to “I need to add a select_related to this line here”.
This is where the tooling we discuss in this article helps.
The above video shows our Chrome developer tool plug-in debugging poorly performing AJAX calls.
An influence on the development of this tool was the venerable Django Developer Toolbar. We found that while this tool was helpful in debugging performance issues it had surprisingly low adoption and usage in Zenefits. We attributed this to not interacting well with AJAX calls and being too awkward / time consuming to map the SQL calls to the code that might have produced it. Instead engineers used more traditional CPU profiling tools, which while somewhat misaligned to DB N+1 was at least orientated around the code.
The tool we present here has a few advantages over either the SQL call list or CPU profilers:
- Shows stack trace of the SQL call
- Shows proposed solution (select_related/prefetch_related)
- Shows repeated SQL count, a line of code issuing the same SQL but with different parameters, often an indicator of a N+1 loop
- Shows exact SQL duplicate count, calls that would benefit from being cached or pulled out of the per-item loop
- Chrome developer tool plug-in, integrates the tooling in existing workflow and reduces friction
- Flame Graph visualisation, collects all the stack traces together to allow quickly identifying which area(s) of code is producing the load
Even though we expected developer UX to be important it was still surprising how impactful integrating with their existing workflow as a chrome developer tool plug-in and enabling it by default in our cloud development environment was. It doesn’t matter how useful a tool is if you don’t remember it at the time or delay applying it to the problem due to extra steps required. “At-handiness” and friction to apply the tool are critical for adoption.
Explicitly dealing with N+1 in the tool was a big win. We do this by matching a hash of the stack trace and SQL without the parameters as a “signature”. In more naive tools a N+1 will show as hundreds of individual SQL statements, but here it will be a single entry. This also lets us sort by impact i.e. we sort by number of repeats of the signature.
The second issue we had is with API or tasks that result in large number of SQL calls using a number of different services. There might be a single service that’s causing the bulk of the SQL but it’s hard to notice that there’s a common ancestor in the stack traces of the SQL calls. We implemented a “Flame Graph” visualisation to display the stack traces in a combined way. This different projection of the same data makes identifying the the areas of code to focus on easier and is complementary to the table listing.
Where can I get it?
There’s an open-source re-implementation of the [internal] Zenefits SQL query profiler available here: https://github.com/django-query-profiler/django-query-profiler.
It’s slightly different than the internal Zenefits version of the SQL query profiler described here and shown in the video above. It’s visual styling is improved, it supports newer versions of Django and it’s far easier to install on a normal Django set-up.
A larger difference is the internal Zenefits version also supports production profiling. This however makes it considerably more complex and has dependencies on features in our forked version of Django. Hopefully we’ll have a chance to talk about how this production tracing works in a follow-up article.
Authors line: Yash Maheshwari wrote the internal tool while at Zenefits and the open-source reimplementation some time later | Glynn Morrison supervised the internal project and contributed some code to the open-source project.