Using temporary tables in PostgreSQL with SQLAlchemy

Granular Engineering · Aug 22, 2018 · 5 min read

By Warren Henning

In a previous post, I talked about how we use SQLAlchemy, our ORM of choice, to create composable, maintainable business intelligence queries. This post discusses how SQLAlchemy also helps us keep those complex queries performant, and includes pointers to commonly used tools and resources we’ve found helpful here at Granular.

Application performance is a large topic and covers every layer of the application stack a request goes through, from the UI of the customer’s device (phone, tablet, or computer) down to the hardware of servers and data centers. Here I’m only going to discuss a particular slice of that stack: SQL queries running against PostgreSQL.

The first step I go through when trying to make a feature like our Financial Plans report faster is profiling the code to find points of inefficiency. For this feature, the queries are always the bottleneck; any number of profiling tools can confirm this, so pick your favorite tool or technique. Once we’ve found a slow query, we want to run EXPLAIN ANALYZE on it, which shows the database’s imperative execution plan for our declarative SQL query. Options abound for doing this, of course. One way is to write a bit of temporary debugging code that compiles the query and prints the query plan straight to the console:
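(A minimal sketch of that throwaway debugging code; the session and ORM query objects are assumed to already exist, and the helper name is just for illustration. Note that EXPLAIN ANALYZE actually executes the query.)

```python
from sqlalchemy import text
from sqlalchemy.dialects import postgresql


def explain_analyze(session, query):
    # Compile the ORM query to a PostgreSQL-specific SQL string, inlining
    # bound parameters so the statement can be fed straight to EXPLAIN.
    sql = query.statement.compile(
        dialect=postgresql.dialect(),
        compile_kwargs={"literal_binds": True},
    )
    # EXPLAIN ANALYZE runs the query and returns the plan, one line per row.
    for row in session.execute(text("EXPLAIN ANALYZE " + str(sql))):
        print(row[0])
```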

The next step is to analyze and act on the insights gleaned from this query plan. This can be challenging and is something I’m still mastering, but there are some tools that can help: pgAdmin can generate a nice graphical visualization of the query plan, and explain.depesz.com can highlight problem areas of the plan and help you understand where the query’s bottleneck is occurring. Explain.depesz is particularly helpful for more complex queries, where a plan contains lots of timing information and only some of it is relevant. One thing to note is that the site apparently expects the plan you enter to begin with:
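```
                                 QUERY PLAN
------------------------------------------------------------------
```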

Or something similar (that is, the header psql prints above a plan); otherwise you’ll get an unhelpful server error with no indication of why your seemingly valid, header-less query plan couldn’t be parsed.

One optimization technique we’ve started using on a limited basis is temporary tables. When a set of complex queries shares a common subquery, we can pull that shared part into a temporary table, index its key columns, and then join against it for significant efficiency gains. Of course, whether this is effective needs to be validated with empirical data, but I was able to significantly decrease the load time of our Financial Plans report using this and other similar techniques.
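In raw SQL terms, the pattern looks roughly like this (the table and column names below are invented for illustration):

```python
from sqlalchemy import text

# Materialize the shared subquery once, index it, then join against it; the
# temporary table lives only for the lifetime of this database session.
with engine.begin() as conn:
    conn.execute(text("""
        CREATE TEMPORARY TABLE plan_crops AS
        SELECT id, plan_id, crop, acres
        FROM crops
        WHERE plan_year = 2018
    """))
    conn.execute(text("CREATE INDEX ON plan_crops (plan_id)"))

    # Later queries on the same connection can join the smaller, indexed table.
    rows = conn.execute(text("""
        SELECT e.*
        FROM expenses AS e
        JOIN plan_crops AS pc ON pc.id = e.crop_id
    """)).fetchall()
```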

In order to reduce the boilerplate associated with creating these temporary tables, our platform lead, Bborie Park, developed a small SQLAlchemy extension that makes creating and querying against temporary tables as easy as appending a function call to an existing query:
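(A rough usage sketch: the Crop and Expense model names are invented, and the exact call signature of the internal extension may differ.)

```python
# Start from an ordinary ORM query...
crops_query = session.query(Crop).filter(Crop.plan_year == 2018)

# ...materialize it as an indexed temporary table with one call...
crop_temp = temp_table(crops_query, indexes=["plan_id"])

# ...and join against it like any other selectable.
expenses = (
    session.query(Expense)
    .join(crop_temp, crop_temp.c.id == Expense.crop_id)
    .all()
)
```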

And this is where the power of SQLAlchemy comes in: its extensibility lets us take advantage of advanced features of our chosen database that the ORM doesn’t necessarily support out of the box, without polluting our application code with those details. temp_table() is less than 100 SLOC and was written in a few hours, but it makes working with temporary tables much, much easier.

This functionality creates opportunities for reusing complex queries that can then be easily consumed elsewhere in the application. The developer working on that separate feature can consume the temporary table like any normal model, so the complexity contained within it is somewhat encapsulated and abstracted away. This allows for division of labor and better code reuse in the application.

We will continue looking for ways to deliver value to our customers through performance gains. It’s always challenging because we’re a relatively small team and are trying to rapidly build out a full-featured product, but with tools like SQLAlchemy and PostgreSQL, the job becomes easier.

A simplified version of the code follows. I’ve removed references to internal code that isn’t really relevant to this post, like our soft delete and ACL filtering; because of that coupling, this version might not work as-is, but it should convey the general idea. This code is made available under the MIT license. If there’s interest, a more polished, packaged version could happen.
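(The sketch below is a reconstruction in the same spirit rather than the original extension; it assumes SQLAlchemy 1.4+ against PostgreSQL, and the helper’s signature is illustrative.)

```python
import uuid

from sqlalchemy import Column, MetaData, Table, text
from sqlalchemy.dialects import postgresql


def temp_table(query, name=None, indexes=()):
    """Materialize an ORM query into an indexed PostgreSQL temporary table.

    Illustrative reconstruction, not the original extension. Returns a Table
    describing the temp table so callers can select from and join against it
    like any other selectable.
    """
    session = query.session
    name = name or "tmp_" + uuid.uuid4().hex[:8]

    # Compile with literal binds so the resulting SQL string has no bound
    # parameters (PostgreSQL does not allow parameters inside utility
    # statements like CREATE TABLE AS).
    selectable = query.statement
    compiled = selectable.compile(
        dialect=postgresql.dialect(),
        compile_kwargs={"literal_binds": True},
    )

    # Materialize the query once; the table lives only for this session.
    session.execute(text(
        "CREATE TEMPORARY TABLE {} AS {}".format(name, compiled)
    ))

    # Index the columns callers will join or filter on, then refresh stats.
    for column_name in indexes:
        session.execute(text("CREATE INDEX ON {} ({})".format(name, column_name)))
    session.execute(text("ANALYZE {}".format(name)))

    # Describe the temp table as a lightweight Table so the rest of the
    # application can query it through SQLAlchemy as usual.
    columns = [Column(c.name, c.type) for c in selectable.selected_columns]
    return Table(name, MetaData(), *columns)
```

Since PostgreSQL temporary tables are scoped to the session that created them, anything joining against the returned Table needs to run on that same session.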
