Reading Postgres query plans for beginners


Like me, you’re a Postgres noob. You’re a software dev (and a good one!), but you have to write database queries every now and then, and often they’re pretty slow. There’s Alex, the DBA — you can ask them for help. But even though they’re really polite, you know that there are things they’d rather be doing. And even if you can’t do the whole thing yourself, you’d like to be able to go to Alex with at least some understanding of the problem.

This is for you.

Postgres has some pretty great tooling around understanding slow queries — you can just ask the database to tell you what’s taking it so long. You do this by running an EXPLAIN command. So if you want to understand your query to find people called “Alice”, you’d run:

EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS)
SELECT first_name, last_name
FROM people
WHERE first_name = 'Alice';

Hold on, let me EXPLAIN!

Okay, let’s break that down.

Lines 2–4 are just an ordinary Postgres query (EXPLAIN works on any SELECT, INSERT, UPDATE, or DELETE) — they’re the code we want to improve.

EXPLAIN is the command we’re running here. It returns the query plan, which is the database’s way of describing what it’s going to do. If you just run EXPLAIN on the query, without all the options in brackets, you’ll get some text output like this:

Seq Scan on people  (cost=0.00..10.88 rows=2 width=1032)
  Filter: ((first_name)::text = 'Alice'::text)

I personally find this format really hard to read! I’d recommend always requesting the output formatted as JSON (or YAML, or XML at a push, depending on your preference).

If you run EXPLAIN (FORMAT JSON) on the query, you should see something like:

[
  {
    "Plan": {
      "Node Type": "Seq Scan",
      "Parallel Aware": false,
      "Relation Name": "people",
      "Alias": "people",
      "Filter": "((first_name)::text = 'Alice'::text)",
      "Startup Cost": 0.00,
      "Total Cost": 16.49,
      "Plan Rows": 2,
      "Plan Width": 1032
    }
  }
]
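A nice side effect of asking for JSON is that the plan is machine-readable. As a quick sketch (using only Python’s standard library, with the values from the example above pasted in), you can pull out the fields you care about instead of squinting at text:

```python
import json

# The output of EXPLAIN (FORMAT JSON), pasted in as a string.
# Values taken from the example plan above.
explain_output = """
[
  {
    "Plan": {
      "Node Type": "Seq Scan",
      "Relation Name": "people",
      "Filter": "((first_name)::text = 'Alice'::text)",
      "Startup Cost": 0.00,
      "Total Cost": 16.49,
      "Plan Rows": 2,
      "Plan Width": 1032
    }
  }
]
"""

# The top level is a one-element list; the plan tree lives under "Plan".
plan = json.loads(explain_output)[0]["Plan"]
print(plan["Node Type"])      # Seq Scan
print(plan["Relation Name"])  # people
print(plan["Total Cost"])     # 16.49
```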

We’re not going to go into too much detail on all of these fields now — we’ll deal with a lot of them later on. The important things to notice for now are that there are some details about the operation taking place — its type (a sequential scan), the filter being used, and the database object it’s operating on.

This is a really simple example — a query with one operation. Most of the queries you write will consist of several operations. Here’s a query plan with a whole bunch of different operations, nested and stacked in exciting ways, which was generated from an example on the excellent postgresqltutorial.com.
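Those nested plans are just trees: each node can carry a "Plans" key holding its child operations. A little Python sketch (using a made-up plan shape, not the tutorial’s actual output) shows how you’d walk one:

```python
def walk(node, depth=0):
    """Return an indented list of operation names, one per plan node.

    Child operations live under the node's "Plans" key."""
    lines = ["  " * depth + node["Node Type"]]
    for child in node.get("Plans", []):
        lines.extend(walk(child, depth + 1))
    return lines

# A hypothetical nested plan, just to show the shape: a hash join
# fed by a sequential scan and a hash built from another scan.
plan = {
    "Node Type": "Hash Join",
    "Plans": [
        {"Node Type": "Seq Scan"},
        {"Node Type": "Hash", "Plans": [{"Node Type": "Seq Scan"}]},
    ],
}

print("\n".join(walk(plan)))
```

Reading the indented output top-down mirrors how you’d read the plan: each operation consumes the rows produced by the operations nested beneath it.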

Going back to the original example, we’ve talked about the bits that explain what the database will do when it executes your query, but there are four extra fields that start to give you an idea of how the different operations will perform. The “Startup Cost” and “Total Cost” values give you a rough idea of the relative time (in arbitrary units) spent on each operation, while “Plan Rows” and “Plan Width” tell you the expected number of rows, and the size of each row (in bytes).

The important thing to remember about all these values, though, is that they’re estimates. The query planner has access to a bunch of statistics about your database that it uses to work out how long each operation will take and which approach to choose. Often, that’s pretty accurate, particularly for smaller datasets. But it’s definitely useful to be able to get a more realistic picture.
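One small thing those estimates let you do is ballpark how much data an operation expects to produce: multiplying “Plan Rows” by “Plan Width” gives the expected output size in bytes. A tiny sketch, using the numbers from the plan above:

```python
# Estimated output size of the Seq Scan node above:
# "Plan Rows" (expected rows) times "Plan Width" (bytes per row).
plan_rows = 2
plan_width = 1032

estimated_bytes = plan_rows * plan_width
print(estimated_bytes)  # 2064
```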

ANALYZE this

By adding the ANALYZE option to the EXPLAIN command, we start to get some more concrete information. The database will actually execute your query (so any side-effects will occur!), and give you some data about your query’s real-life performance — timings (in ms), row counts, and the number of loops/applications of the step. EXPLAIN (FORMAT JSON, ANALYZE) on our example query gives us:

[
  {
    "Plan": {
      "Node Type": "Seq Scan",
      "Parallel Aware": false,
      "Relation Name": "people",
      "Alias": "people",
      "Filter": "((first_name)::text = 'Alice'::text)",
      "Startup Cost": 0.00,
      "Total Cost": 16.49,
      "Plan Rows": 2,
      "Plan Width": 1032,
      "Actual Startup Time": 0.099,
      "Actual Total Time": 0.357,
      "Actual Rows": 2,
      "Actual Loops": 1,
      "Rows Removed by Filter": 2684
    },
    "Planning Time": 0.108,
    "Execution Time": 0.381,
    "Triggers": []
  }
]

Needless to say, knowing the actual time taken and rows produced for each operation can be invaluable, particularly for cases where the query planner has made some bad decisions based on incomplete or out of date information. You can read more about row count estimates here.
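Having both “Plan Rows” and “Actual Rows” in hand, one quick sanity check is to see how far apart they are. A small sketch (the helper name and the 10x threshold are my own, not anything Postgres defines):

```python
def misestimate_factor(plan_rows, actual_rows):
    """How far off the planner's row estimate was, as a >= 1 multiplier.

    A factor near 1 means the estimate was good; a factor of, say,
    100 or more often points at stale or missing table statistics."""
    if plan_rows == 0 or actual_rows == 0:
        return 1.0 if plan_rows == actual_rows else float("inf")
    return max(plan_rows / actual_rows, actual_rows / plan_rows)

# Our example node: the planner expected 2 rows and got 2. Spot on.
print(misestimate_factor(2, 2))      # 1.0

# A hypothetical badly-estimated node: expected 10, got 5000.
print(misestimate_factor(10, 5000))  # 500.0 -- worth investigating
```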

Hitting the BUFFERS

By adding the BUFFERS option, we get some information about cache hits/misses that can tell you a lot about the amount of time spent on I/O. EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS) for our example query returns:

[
  {
    "Plan": {
      "Node Type": "Seq Scan",
      "Parallel Aware": false,
      "Relation Name": "people",
      "Alias": "people",
      "Filter": "((first_name)::text = 'Alice'::text)",
      "Startup Cost": 0.00,
      "Total Cost": 16.49,
      "Plan Rows": 2,
      "Plan Width": 1032,
      "Actual Startup Time": 0.026,
      "Actual Total Time": 0.106,
      "Actual Rows": 2,
      "Actual Loops": 1,
      "Rows Removed by Filter": 2684,
      "Shared Hit Blocks": 9,
      "Shared Read Blocks": 0,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 0,
      "Temp Written Blocks": 0
    },
    "Planning Time": 0.120,
    "Execution Time": 0.125,
    "Triggers": []
  }
]

This can be useful when you’re trying to understand cache usage and make sure as much as possible happens in RAM, but for now we’re just going to point out that it exists.
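If you do want a single number to glance at, a common rule of thumb is the cache hit ratio: of the shared blocks a node touched, how many were already in memory (“Hit”) versus read from disk (“Read”)? A minimal sketch, using the values from the plan above (the function name is mine, not a Postgres term):

```python
def shared_hit_ratio(node):
    """Fraction of a node's shared-buffer accesses served from cache.

    "Shared Hit Blocks" were found in memory; "Shared Read Blocks"
    had to come from disk (or the OS cache)."""
    hits = node["Shared Hit Blocks"]
    reads = node["Shared Read Blocks"]
    total = hits + reads
    return hits / total if total else 1.0

# Our example Seq Scan: 9 hits, 0 reads -- everything came from cache.
node = {"Shared Hit Blocks": 9, "Shared Read Blocks": 0}
print(shared_hit_ratio(node))  # 1.0
```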

If you want to ask a pro for help (for example your DBA Alex, or the knowledgeable and friendly Postgres performance mailing lists), then the full EXPLAIN (FORMAT JSON, ANALYZE, BUFFERS) output is one of the things you should include to be a good citizen rather than an annoying noob — along with making sure you’ve done your best to understand the problem yourself.