Estimated counts for faster Django Admin change_list

Ketan Bhatt
Apr 19, 2017 · 3 min read

Short story of how we reduced the response time of some of our admin pages by 1000x.

The Problem

One of our tables grew to over 25 million rows. Which, while not a large number, makes the most frequent queries slow.

A direct effect it had was on our Admin panel. The change_list form for the model started taking ages to load. Our Operations team uses the Admin 24*7 for most of what we do, we could see them grumbling about how slow things have become as they had to sit for 2–3 seconds minimum before the page would load completely. This lead to low efficiency for at least 10 people every day. This had to be fixed.

We had already done stuff like using select_related or prefetch_related on the related items to reduce the number of queries, what else could we do? On some investigation (Thank you DDT), we found out that a certain count query was taking 98% of the total time taken by SQL queries on that page. It was clear that this count query would be painfully slow for large tables. Why was this query being made? For the admin’s paginator to work.

The Solution

It was clear that the pagination logic will have to change. Our requirement with the admin was just to give an estimate of the number of rows that were there, and be able to navigate from one page to the another. For that, we would need a faster way to get the count of rows in a table.

We are using PostgreSQL and it has wonderful docs for things like how to estimate counts. PostgreSQL maintains metadata about the database in system catalogs. One such catalog is the pg_class. On every VACUUM, ANALYZE, or for commands like CREATE INDEX, postgres updates the relevant rows in pg_class. So instead of getting count like:

SELECT COUNT(*) AS "__count" FROM "my_table"

We could get a rough estimate by doing:

SELECT reltuples FROM pg_class WHERE relname = 'my_table'

This query takes no more than 1 ms to complete!

To integrate it with Django’s ModelAdmin seamlessly, we created a custom Paginator:

And we are sorted. Lightening fast page loads are back, and our Operations team is happy.

UPDATE: The code has been updated to be compatible with Django ≤ 1.11

The Future

You can extend the work done here to make a more generic library:

  1. By making a queryset method that returns the approximate count. This method can then be used anywhere you just need an estimate of the count.
  2. By checking for connection.vendor before estimating, and using database specific queries.

and, The Gotchas

The custom paginator is not suitable with tables having small number of rows. This is because:

  1. Sometimes the admin will not show any rows for a given model. That will be because the reltuples for that table in pg_class returned 0. This happens with small tables which have never been analysed/vacuumed. (Running an ANALYSE for these tables will fix these, but read point 2)
  2. They generally get analysed/vacuumed with lesser frequency and the percentage deviation from the actual values will be higher here.

Other than that, this just works with PostgreSQL (but there should be similar ways to estimate counts in other databases as well).

SquadStack Engineering

Stories from the people who build SquadStack

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store