A declarative way to cache PostgreSQL queries using Node.js

Gajus Kuizinas
Nov 2, 2019 · 4 min read

There are many reasons you might want to cache SQL queries. Some are valid, e.g. reducing the number of round-trips (especially when dealing with high latency). Others are micro-optimizations that are simply not worth it. Regardless of your reasons, implementing caching can be cumbersome.


Subject

I am going to use Slonik (a PostgreSQL client for Node.js) and node-cache to demonstrate the usual way to implement a cache, and then a declarative way to add caching to your existing codebase.

Let’s assume a simple query method that gets a PK value using another unique identifier:

const getCountryIdByCodeAlpha2 = (
  connection: DatabaseConnectionType,
  countryCode: string
): Promise<DatabaseRecordIdType> => {
  return connection.maybeOneFirst(sql`
    SELECT id
    FROM country
    WHERE code_alpha_2 = ${countryCode}
  `);
};

This type of query is particularly common when ingesting data from external inputs (e.g. user-submitted input or data collected through scraping).

Measuring the problem

In the particular case that prompted me to explore caching, this query was called 7k+ times per minute. Aside from it, there were a dozen other similar queries that collectively executed well over 50k times per minute. None of them affected my database server's performance (PostgreSQL is already good at caching), but they:

  • generate unnecessary logs
  • increase the overall time needed to complete the task

The time it takes for PostgreSQL to execute such a query is minuscule, e.g.

EXPLAIN ANALYZE
SELECT id
FROM country
WHERE code_alpha_2 = 'gb';
Index Only Scan using country_code_alpha_2_id_idx on country (cost=0.14..1.16 rows=1 width=4) (actual time=0.425..0.426 rows=1 loops=1)
Index Cond: (code_alpha_2 = 'gb'::citext)
Heap Fetches: 0
Planning Time: 0.069 ms
Execution Time: 0.439 ms

However, we also have to add the network time. In my case, the latency between the worker agent and the database is ~3ms.

ping ***.aivencloud.com                                                     17:31:54
PING ***.aivencloud.com (34.90.***.***): 56 data bytes
64 bytes from 34.90.***.***: icmp_seq=0 ttl=53 time=3.166 ms
64 bytes from 34.90.***.***: icmp_seq=1 ttl=53 time=2.627 ms
64 bytes from 34.90.***.***: icmp_seq=2 ttl=53 time=2.873 ms

That means that executing a query and getting the result takes at least 7.5ms (0.5ms query execution time + 2 network trips). Put another way, every 60 seconds we waste ~350 seconds of computing time (spread across many servers). Over time, this adds up to a lot (70 hours over a month).
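The "~350 seconds per minute" figure follows directly from the volume and latency numbers quoted above. A back-of-envelope sketch, using round numbers (the exact per-query cost depends on how many network trips a query requires):

```typescript
// Back-of-envelope arithmetic for the aggregate cost of trivial queries.
// ~50,000 queries/minute, each costing ~7ms of execution + network latency.
const queriesPerMinute = 50_000;
const costPerQueryMs = 7;

const wastedMsPerMinute = queriesPerMinute * costPerQueryMs;
const wastedSecondsPerMinute = wastedMsPerMinute / 1000;

console.log(wastedSecondsPerMinute); // 350
```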

Implementing cache

All you need to implement a cache is some storage service with a mechanism to limit how long and how many items can be stored. node-cache is such an abstraction for synchronously storing/retrieving objects in memory. Using node-cache, you use the set method to store a cache entry and the get method to retrieve it; node-cache handles invalidation and storage limits behind the scenes. This is how getCountryIdByCodeAlpha2 would look if it used node-cache:

const cache = new NodeCache({
  checkperiod: 60,
  maxKeys: 10000,
  stdTTL: 60,
  useClones: false,
});

const getCountryIdByCodeAlpha2 = async (
  cache: NodeCache,
  connection: DatabaseConnectionType,
  countryCode: string
): Promise<DatabaseRecordIdType> => {
  const maybeCountryId = cache.get(countryCode);

  if (maybeCountryId) {
    return maybeCountryId;
  }

  const maybeResult = await connection.maybeOneFirst(sql`
    SELECT id
    FROM country
    WHERE code_alpha_2 = ${countryCode}
  `);

  cache.set(countryCode, maybeResult);

  return maybeResult;
};

However, this way of adding caching has a few disadvantages:

  • It introduces a lot of boilerplate around every query.
  • It introduces an additional dependency (the NodeCache instance) that needs to be passed around throughout your codebase along with the database connection handle.

If you had to go this way, nine times out of ten I would say it is not worth it. Luckily, there is a better way.

Declarative cache

Slonik has a concept of interceptors (middleware) that can be used to capture and modify SQL requests and responses. This makes them perfect for implementing a cache. Such an interceptor already exists: slonik-interceptor-query-cache.

slonik-interceptor-query-cache uses SQL comments to recognize which queries should be cached and for how long. Specifically, it searches for a @cache-ttl comment, which indicates how long the query should be cached; queries without @cache-ttl are not cached at all. To cache the result of the earlier query for 60 seconds, the only change we need to make is to add a @cache-ttl comment to our query:

const getCountryIdByCodeAlpha2 = (
  connection: DatabaseConnectionType,
  countryCode: string
): Promise<DatabaseRecordIdType> => {
  return connection.maybeOneFirst(sql`
    -- @cache-ttl 60
    SELECT id
    FROM country
    WHERE code_alpha_2 = ${countryCode}
  `);
};

Now the result of this query will be cached for 60 seconds for each unique countryCode.
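Under the hood, the interceptor needs to extract the TTL from the comment before deciding whether to consult the cache. Purely as an illustration (this is not the library's actual implementation, and the real one is configurable), a minimal parser might look like:

```typescript
// Illustrative sketch: extract the TTL from a "-- @cache-ttl <seconds>" comment.
// Hypothetical helper — slonik-interceptor-query-cache has its own parsing.
const extractCacheTtl = (sql: string): number | null => {
  const match = sql.match(/--\s*@cache-ttl\s+(\d+)/);

  return match ? Number(match[1]) : null;
};

extractCacheTtl(`
  -- @cache-ttl 60
  SELECT id FROM country WHERE code_alpha_2 = 'gb'
`); // 60
extractCacheTtl(`SELECT 1`); // null (not cached)
```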

slonik-interceptor-query-cache does not implement storage, though. You can use node-cache, lru-cache, Redis, or any other storage engine. To use them, you simply need to abstract their interface using get and set methods and provide them to createQueryCacheInterceptor. Continuing with the node-cache example, this is how you would initiate Slonik with the query cache interceptor using node-cache as a storage engine:

import NodeCache from 'node-cache';
import {
  createPool,
  QueryType
} from 'slonik';
import {
  createQueryCacheInterceptor
} from 'slonik-interceptor-query-cache';

const nodeCache = new NodeCache({
  checkperiod: 60,
  stdTTL: 60,
  useClones: false,
});

const hashQuery = (query: QueryType): string => {
  return JSON.stringify(query);
};

const pool = createPool('postgres://', {
  interceptors: [
    createQueryCacheInterceptor({
      storage: {
        get: (query) => {
          return nodeCache.get(hashQuery(query)) || null;
        },
        set: (query, cacheAttributes, queryResult) => {
          nodeCache.set(hashQuery(query), queryResult, cacheAttributes.ttl);
        },
      },
    }),
  ],
});
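The storage contract is deliberately small: anything exposing compatible get/set methods will do. As a sketch of how little is required, here is a hypothetical throwaway in-memory adapter backed by a plain Map (createMapStorage is my own name, and TTL eviction is deliberately omitted — that is what node-cache, lru-cache, or Redis would give you):

```typescript
// Minimal key/value storage sketch satisfying a get/set contract.
// No TTL eviction — illustrative only, not suitable for production.
const createMapStorage = () => {
  const store = new Map<string, unknown>();

  return {
    // Return null (not undefined) on a miss, signalling "not cached".
    get: (key: string) => store.get(key) ?? null,
    set: (key: string, value: unknown) => {
      store.set(key, value);
    },
  };
};

const storage = createMapStorage();

storage.set('country:gb', 826);
storage.get('country:gb'); // 826
storage.get('missing'); // null
```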

And that is it: with minimal code changes, you can now cache any query just by adding a comment to the SQL. Among other benefits, this:

  • allows you to quickly test the impact of caching a specific query
  • allows you to quickly enable/disable query caching (by simply adding/removing the query cache interceptor)
  • does not affect how you write test cases
  • does not add boilerplate code to every query
  • does not require passing an additional dependency to every query invocation

The Startup

Medium's largest active publication, followed by +756K people.

Gajus Kuizinas

Written by

Software architect, startup adviser. Editor of https://medium.com/applaudience. Founder of https://go2cinema.com.
