Dynamically generating SQL queries using Node.js

Gajus Kuizinas
Aug 30, 2019 · 5 min read

Ever since I have released Slonik (PostgreSQL client for Node.js) and written a controversial Stop using Knex.js article (tl;dr; query builders are designed to be building blocks for ORMs; they do not add value when majority of the query is static.), I have been asked a lot — then how do I generate dynamic queries? I will answer this by sharing a couple of real-life examples.

All of the queries in this article are actual queries used in real-life business, Applaudience which heavily relies on PostgreSQL.

Disclaimer: (1) All examples discuss only SQL injection threats. Authorization logic (e.g. whitelisting columns user is authorized to access) is not in the scope of this article. (2) All statements assume there are no bugs in Slonik implementation.

Image for post
Image for post
This article can be also read (with better syntax highlighting) on dev.to.

Static query with dynamic value bindings

sql`
SELECT c1.country_id
FROM cinema_movie_name cmn1
INNER JOIN cinema c1 ON c1.id = cmn1.cinema_id
WHERE cmn1.id = ${cinemaMovieNameId}
`;

If you are using Slonik, it is safe to pass values as template literal placeholders. sql will interpret all placeholder tokens and construct final SQL query. In this case, the only dynamic part of the query is the value bindings themselves, therefore the final query is:

SELECT c1.country_id
FROM cinema_movie_name cmn1
INNER JOIN cinema c1 ON c1.id = cmn1.cinema_id
WHERE cmn1.id = $1

Query and bound values will be sent to PostgreSQL separately: no risk of SQL injection.

Binding a list of values

sql`
SELECT m1.*
FROM movie m1
WHERE m1.id IN (${sql.valueList(movieIds)})
`;

This will generate a query with dynamic set of value bindings, i.e. if movieIds is [1, 2, 3] the query that is sent to PostgreSQL will be:

SELECT m1.*
FROM movie m1
WHERE m1.id IN ($1, $2, $3)

However, despite this being a common pattern, I do not advise to use this pattern. Instead, use sql.array, e.g.

sql`
SELECT m1.*
FROM movie m1
WHERE m1.id = ANY(${sql.array(movieIds, 'int4')})
`;

This will generate a fixed-length query that does not change based on its inputs, i.e.

SELECT m1.*
FROM movie m1
WHERE m1.id = ANY($1::"int4"[])

Continue reading sql.array vs sql.valueList.

Query with dynamic columns

(Note: Not an actual query used in business. See next paragraph.)

sql`
SELECT m1.id, ${sql.identifier(['m1', movieTableColumnName])}
FROM movie m1
WHERE
m1.id = ${moveId}
`;

This query will produce a query that selects exactly 1 dynamically identified column. There is no risk of SQL injection, i.e. even if logic leading to generation of movieTableColumnName was somehow compromised, the worst that can happen is that query attacker will be able to return any column under m1 alias or execute query with invalid column identifier values (both carry risk; business logic is not in scope of this article).

Just because you can do this, you probably shouldn’t. When your application requires to return different columns depending on user’s query, it is better to select all columns that are in scope of the business logic and pick value of the needed column, i.e. If the intent of the latter query was to return a different movie identifier based on movieTableColumnName, then it is better to write a static query:

sql`
SELECT
m1.id,
m1.foreign_comscore_id,
m1.foreign_imdb_id,
m1.foreign_metacritic_id
m1.foreign_rottentomatoes_id,
m1.foreign_tmdb_id,
m1.foreign_webedia_id
FROM movie m1
WHERE
m1.id = ${moveId}
`;

The latter has does return some superfluous data on every query, but it has several advantages:

  1. It reduces risk of SQL injection (regardless of how much you trust code generation logic, static code is always safer than dynamic code).
  2. It produces only one entry pg_stat_statements. You will learn to appreciate as few as possible queries in pg_stat_statements as your application scales.

Query with multiple dynamic columns

Nesting dynamic SQL queries

(Note: Simplified version of an actual query used in business.)

const futureEventEventChangeSqlToken = sql`
SELECT
ec1.event_id,
ec1.seat_count,
ec1.seat_sold_count
FROM event_change_future_event_view ec1
`;
sql`
SELECT
event_id,
seat_count,
seat_sold_count
FROM (
${futureEventEventChangeSqlToken}
) AS haystack
WHERE ${paginatedWhereSqlToken}
ORDER BY ${orderSqlToken}
LIMIT ${limitSqlToken}
`

This allows to pass pre-bound SQL queries as first-class citizens across your program. This is handy when the intent is to isolate SQL generation logic for testing or when large SQL fragments are shared between queries or when the intent is to simply reduce concentration of code complexity in one place.

Injecting dynamic SQL fragments

sql`
SELECT ${sql.raw('foo bar baz')}
`

translates to (invalid) query:

SELECT foo bar baz

Unlike the previous example using sql tagged template, sql.raw is not safe – it allows to create dynamic SQL using user input.

There are no known use cases for generating queries using sql.raw that aren't covered by nesting bound sql expressions (described in "Nesting dynamic SQL queries") or by one of the other existing query building methods. sql.raw exists as a mechanism to execute externally stored static(e.g. queries stored in files).

Query with a dynamic comparison predicate members or operator

(Note: Not an actual query used in business.)

sql`
SELECT
c1.id,
c1.nid,
c1.name
FROM cinema c1
WHERE
${sql.comparisonPredicate(
sql`c1.name`,
nameComparisonOperator,
nameComparisonValue
)}
`;

nameComparisonOperator can be values such as =, >, <, etc. Assuming nameComparisonOperator is "=", then the resulting query is going to be:

SELECT
c1.id,
c1.nid,
c1.name
FROM cinema c1
WHERE
c1.name = $1

The latter is an extremely rare use case, reserved almost entirely to building higher level SQL abstraction tools (such as ORMs). It may be useful for “advance search” scenarios, however continue reading to familiarise with alternative patterns (see sql.booleanExpression).

Query with dynamic WHERE clause members

const findCinemas = (root, parameters, context) => {
const booleanExpressions = [
sql`TRUE`,
];
if (parameters.input.query) {
const query = parameters.input.query;
if (query.countryId !== undefined) {
booleanExpressions.push(
sql`c2.id = ${query.countryId}`
);
}
if (query.nid !== undefined) {
booleanExpressions.push(
sql`c1.nid % ${query.nid}`
);
}
if (query.name !== undefined) {
booleanExpressions.push(
sql`c1.name % ${query.name}`
);
}
}
const whereSqlToken = sql.booleanExpression(
booleanExpressions,
'AND'
);
return context.pool.any(sql`
SELECT
c1.id,
c1.nid,
c1.name,
c2.code_alpha_2 country_code,
c2.name country_name
FROM cinema c1
INNER JOIN country c2 ON c2.id = c1.country_id
WHERE ${whereSqlToken}
`);
},

findCinemas is an implementation of a GraphQL resolver. WHERE clause of the query is constructed using a combination of 3 possible boolean expressions. As is the case with all the other query building methods in Slonik, all expressions can be nested: you can have other boolean expressions as members of a boolean expression or even SQL expression constructed using sql tagged template literal.

Summary

Finally, if I have missed a use case scenario that you would like me to cover, mention it in the comments and I will happily include it.

JavaScript In Plain English

New JavaScript + Web Development articles every day.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store