Recursion in SQL Explained Visually

Denis Lukichev
Nov 22, 2020 · 5 min read

Recursion in SQL? But why? Oh, there are many uses for that. It’s common to store hierarchical data in SQL and recursive queries are a convenient way to extract information from such graphs. Organizational structure, application menu structure, a set of tasks with sub-tasks in the project, links between web pages, breakdown of an equipment module into parts and sub-parts are examples of the hierarchical data. The post will not go into great details of those many use cases rather look at two toy examples to understand the concept - the simplest possible case of recursion on numbers and querying data from the family tree.

Let's think about queries as a function. In a sense that a function takes an input and produces an output. Queries operate on relations or one could say tables. We will denote those as Rn. Here is a picture of a query. It takes three relations R1, R2, R3 and produces an output R. Simple enough.

SQL example: SELECT <something> FROM R1, R2, R3 WHERE <condition>

Query can take something and produce nothing:

SQL example: SELECT <something> FROM R1 WHERE 1 = 2

Take nothing and produce something:

SQL example: SELECT 1

Or take nothing and produce nothing

SELECT 1 WHERE 1 = 2

Recursion is achieved by WITH statement, in SQL jargon called Common Table Expression (CTE). It allows to name the result and reference it within other queries sometime later.

Here is a sample.

WITH R AS (SELECT 1 AS n)
SELECT n + 1 FROM R

Query (SELECT 1 AS n) now have a name — R. We refer to that name in SELECT n + 1 FROM R. Here R is a single row, single column table containing number 1. The result of the whole expression is number 2.

The recursive version of WITH statement references to itself while computing output.

For the recursion to work we need to start with something and decide when the recursion should stop. To achieve this, usually recursive with statement has following form.

Important to note that base query doesn’t involve R, but recursive query references R. From the first look it seems like infinite loop, to compute R we need compute R. But here is a catch. R actually don’t reference itself, it just references previous result and when previous result is empty table, recursion stops. Actually it could help to thing of it as an iteration rather then recursion!

Here’s what is happening: base query executed first, taking whatever it needs to compute the result R0. Second recursive query is executed taking R0 as input, that is R references R0 in the recursive query when first executed. Recursive query produces the result R1 and that is what R will reference to at the next invocation. And so on until recursive query returns empty result. At that point all intermediate results are combined together.

Quite abstract now. Lets take a concrete example, count until 3.

Base query returns number 1 , recursive query takes it under the countUp name and produces number 2, which is the input for the next recursive call. When recursive query returns empty table (n >= 3), the results from the calls are stacked together.

Watch out, counting up like that can only go that far. There is a limit for recursion. It defaults to 100, but could be extended with MAXRECURSION option (MS SQL Server specific). Practically, it could be a bad idea to crank recursion limit up. Graphs might have cycles and limited recursion depth can be a good defense mechanism to stop poorly behaving query.

OPTION (MAXRECURSION 200)

Here’s another example, find ancestors of a person:

Base query finds Frank’s parent — Mary, recursive query takes this result under the Ancestor name and finds parents of Mary, which are Dave and Eve and this continues until we can’t find any parents anymore.

Now this tree traversal query could be the basis to augment the query with some other information of interest. For example, having a birth year in the table we can calculate how old the parent was when the child was born. Next query do exactly that, together with showing lineages. To do that it traverses the tree from top to bottom. parentAge is zero in the first row because we don’t know when Alice was born from the data we have.

Take away — recursive query references the result of base query or previous invocation of recursive query. Chain stops when recursive query returns empty table.

In the follow-up post we’ll take an algebraic view on SQL recursion and will look into recursive stored procedures.

[UPDATE]. Post updated with comments from kagato87 and GuybrushFourpwood reddit users.

The Startup

Get smarter at building your thing. Join The Startup’s +793K followers.

Sign up for Top 10 Stories

By The Startup

Get smarter at building your thing. Subscribe to receive The Startup's top 10 most read stories — delivered straight into your inbox, once a week. Take a look.

By signing up, you will create a Medium account if you don’t already have one. Review our Privacy Policy for more information about our privacy practices.

Check your inbox
Medium sent you an email at to complete your subscription.

The Startup

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +793K followers.

Denis Lukichev

Written by

The Startup

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +793K followers.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store