Modeling Status Updates on PostgreSQL using JSON

Image for post
Image for post

Many applications require records to keep a status (‘complete’, ‘cancelled’, ‘in progress’, etc). Usually, there is also an associated traceability constraint: we need to know when the status was modified and who modified it.

Traditionally, this would be accomplished using 2 tables: a “base table” for the object’s data and a “status table” for status updates. However, taking advantage of PostgreSQL’s support of JSON, there is a better alternative.

The Traditional Route

Let’s try to model an example using the “2 tables” approach.

We can imagine a ticket system with tables like these:

Image for post
Image for post
FROM status_updates
WHERE ticket_id = 99;

With a very simple query, we are able to get the complete ticket’s history.

  SELECT status 
FROM status_updates
WHERE ticket_id = 99
ORDER BY created_at DESC LIMIT 1;

Getting a ticket’s current status requires sorting through all status_updates with a matching ticket_id and choosing the most recent one.

This query is not too hard to figure out, and as long as we keep an index on ticket_id, it will perform well.

WITH tickets_with_status AS (
ticket_id, status
FROM status_updates
ORDER BY ticket_id, created_at DESC
FROM tickets_with_status
WHERE status = 'OPEN';

This is where things get a bit more complicated. The problem is that we need to execute 2 separate steps:

  1. Figure out each ticket’s current status by looking at all status_updates and choosing the most recent one .
  2. Filter based on the status we’re looking for, ie: OPEN.

Not only is this query a bit harder to figure out, it will also perform badly, since there is no way to index it.

The JSON route

Taking advantage of PostgreSQL’s support for JSON, we can model the same ticket system using a single table.

Image for post
Image for post
CREATE TABLE tickets (
description TEXT,
assignee INTEGER REFERENCES users(id),
status_updates JSONB NOT NULL,

Notice how the column status_updates has type JSONB (JSONB is a more efficient version of JSON).

There are several ways to create JSON objects in PostgreSQL. The easiest way is to construct a string and then cast it to JSON using ::json.

'[{"status": "OPEN", "author": 3}]'::json

However, since we need a dynamic value in our JSON, ie: NOW(), we’re going to use the slightly more verbose json_build_array and json_build_object.

INSERT INTO tickets (
description, reporter, status_updates, created_at
'Something went wrong.',
'status', 'OPEN',
'author', 3,
'created_at', NOW()

The status_updates value for our new ticket should look like this:

"status": "OPEN",
"author": 3,
"created_at": "2017-02-08T00:10:15.288522-05:00"
UPDATE tickets
SET status_updates = jsonb_build_array(
'status', 'CLOSED',
'author', 2,
'created_at', NOW()
) || status_updates
WHERE id = 1;

Notice the use of PostgreSQL’s concatenation operator || to insert a new value to the start of the existing array.

After this update, the status_updates value for our new ticket should look like this:

"status": "CLOSED",
"author": 2,
"created_at": "2017-02-08T00:40:09.926089-05:00"
"status": "OPEN",
"author": 3,
"created_at": "2017-02-08T00:39:56.437954-05:00"
SELECT status_updates FROM tickets WHERE id = 99;

Since the complete history is in the status_updates column, fetching it is pretty straightforward.

If you follow the rule of always inserting new status updates to the start of the status_updates array, you can use this simple query:

SELECT status_updates->0->>'status'
FROM tickets
WHERE id = 99;
FROM tickets
WHERE status_updates->0->>'status' = 'OPEN';

This is where JSON shines. Not only is this query super easy to write, it is also easy to index (PostgreSQL support indexes on expressions).

CREATE INDEX current_status_idx
ON tickets ((status_updates->0->>'status'));

Constraints allow us to prevent data corruption and keep our data clean. JSON data types should not be exempted from using them.

Since we’re expecting 3 values inside each status update (status, author, and created_at), let’s add some constraints for these fields.

ALTER TABLE tickets-- make sure status has a valid string
status_updates->0->'status' ?| ARRAY[
], false
-- make sure author is a number
jsonb_typeof(status_updates->0->'author') = 'number',
-- make sure created_at is a timestamptz
ADD CONSTRAINT created_at_is_timestamptz CHECK(COALESCE(
status_updates->0->>'created_at' ~

Also, you might have noticed there is a big IF in the above queries:

If you follow the rule of always inserting new status updates to the start of the status_updates array …

As long as you insert using SET status_updates = <new_status> || status_updates this condition will hold. However, let’s just add one more check in case we accidentally try inserting to the end of the array.

-- check the array has only one element, or
jsonb_array_length(status_updates) = 1 OR COALESCE(
-- check the N-1 element has a more recent created_at than the N
-- element.
-- Since dates are compared as the amount of time passed since
-- epoch, a more recent date is 'bigger' than an older date
(status_updates -> jsonb_array_length(status_updates) - 2 ->>
(status_updates -> jsonb_array_length(status_updates) - 1 ->>
'created_at' )::timestamptz
, false)

If you are willing to look past the 1NF theory, you can take advantage of the simplicity and performance boosts that PostgreSQL’s JSON types can offer.

Status updates are a very good fit for JSON types since they’re rarely useful outside the context of the original object.

Can you think of other scenarios where JSON types are useful? Have you solved the “status updates” problem in a different way? Let me know in the comments.

Written by

Full-Stack Web Developer @ X-Team / Digital Nomad. Se habla español.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store