SQL Hierarchical Queries without Recursion and Cursors (The life before CTE in SQL Server)

Sanish Abraham
8 min readAug 11, 2022

…And Why We Need That Now! (A Modern Use Case)

Remember those old times when we had to write scripts to find the ancestors or descendants of a particular node in a hierarchical tree structure stored within the RDBMS table. Recall that scripting circus we went through using cursors and/or recursive functions, which possibly triggered various side effects and performance issues.

In our case, as high traffic was expected in the application, we were totally against the use of cursors or recursions in our project back then. Still, we couldn’t find any online or offline solutions or any sort of suggestions from others for a better alternative. However, I had an intuition that it could be possible and decided to give it a try anyway. Eventually created a simple and elegant solution using a ‘while’ loop and a temporary table with identity column, which saved us a lot of time, effort and of course some extra CPU cycles.

After the introduction of Common Table Expression (CTE) which was released with Microsoft SQL Server 2005 version, a temporary named result set mechanism, the above-mentioned legacy solution became unnecessary. But surprisingly something happened recently in a project and I had to resort to the same classical solution once again. Actually, that inspired me to share the original script here after a decade and half. I guess it might be helpful for the people who are facing similar issues with corrupted hierarchical data.

Let’s start from the beginning with a multiple organizations tree structure diagram as below.

Let’s start from the beginning with a multiple organizations tree structure diagram as below.

A sample tree structure of multi-level organizations

The Pre-SQL Server 2005 era

For all those who started their career before SQL2005 mostly had nightmares about writing hierarchical queries? For instance, we had a table to manage multi-level organizations (or customers in a multi-level marketing application) where each row has parent organization id column which might be null for root organization rows. We were asked to write the script to fetch all organizations under a given organization id which was passed as the input parameter. What were the options?

Recursive functions and cursors

Using a recursive table function and/or cursor, we can iterate through all rows and load data to the table variable or temp table to use in subsequent queries. But that needs multiple iterations depending upon the total levels and also may need expensive cursors and unwanted locks on data. This add extra load on the processor and memory thereby causing significant execution delay. Another unpopular solution, the fallback to middle layer for data processing and returning to database, was also ruled out because of the obvious performance issues. So, what was the workaround then?

A better solution using a while loop and a table variable

Sounds impossible? Well not at all. There is a quite easy and simple trick.

Let’s go by above example:

Create a table to hold multi-level organization structure with parent child relationship as given in the above diagram where four root organizations and their descendant organizations were inserted with special naming pattern for readability. Check the GitHub repository given below to see the scripts in proper readable form.

Create/Insert Scripts:

Result:

Tree structure in RDBMS

The “identity” magic with @table variable inside a while loop

Assume the input is the org id for which we need to find all the child organizations up to the last level. The legacy script uses a while loop and a table variable (or temp table) with an identity column as below:

Script:

Result:

Listing all descendants of A1

The Logic:

The solution is simple, the table variable (or temporary table) is created with an identity column (1,1), and is pre-populated with the starting value which was passed as the param value. And when the child elements selected (based on the parent id same as the above first value) are inserted into the temp table, the identity column values created automatically in the order (seed value as 1 and increment by 1). Then we loop again by picking one by one using a manual counter which point to the current id in the temp table and next level child elements are being added and this continues till the counter crosses the max value of identity field value (when no more rows to process).

We can use temporary #table instead of @table variable. Even we can use global temporary ##table if needed. But make the decision wisely based on the project requirements as each one has its own cons and pros.

Post SQL Server 2005 Era (Introduction of CTE)

But from SQL Server 2005 onwards, with the introduction of CTE (Common Table Expression) we have a straight forward solution. It is quite easy now to find descendants and ancestors.

Result:

Listing descendants of A1 using CTE

Finding ancestors instead of descendants

If we need to find all the ancestors of a particular organization and not descendants, that can be easily achieved by a small change.
Just interchange Id and ParentId in the script using CTE.

Result:

Listing ancestors of A1–1–1–1

Notes:

The similar solution using the classic while loop like the one above is uploaded to the GitHub repository. Please download it from there in case you are interested.

[https://github.com/mailsanish/HierarchicalQueries]

Do we need the legacy script now?

Using CTE for hierarchical queries in latest SQL Server versions saves a few extra lines and also has a better readability. But if we need to work with pre-2005 versions, the old solution would be our best option.

But there is another scenario I stumbled upon in the recent past, where I could find the old method back in charge. It helped us to identify corrupted nodes in the organization tree structure. An organization was wrongly updated with a parent id which was actually the id of a child organization, that means, it ended up in cyclic dependency. Eventually, the stored procedure which uses CTE for selecting descendants of a given organization failed with the error message:

CTE failure when cyclic dependency exists

CTE failed with the above error but didn’t say where exactly it failed. That made it hard to identify which entry or entries caused the issue. The data in the table were huge and it was hard to identify by checking manually by following each organization ancestry. Hence, after weighing different options, I thought of getting help from my old savior by making some tweaks here and there. After that the while loop looked like below (deliberately used some subqueries for better readability):

Result:

To test the above script, change the parent of B1 from NULL to 12, which is the id of one of the descendants. And we will get this:

Displaying the row with invalid ParentId

Notes:

The above script was to identify the invalid entry while searching descendants. Similarly, changes were made to find invalid ancestors by tweaking the script in slightly different way. Please find the script for that (in case you need) in the GitHub repository. [https://github.com/mailsanish/HierarchicalQueries]

“The statement terminated. The maximum recursion 100 has been exhausted before statement completion.” — CTE failure message does not reveal where the circular reference exists. But the updated legacy script can be used to identify the corrupted row that caused the cyclic dependency.

Hierarchical Queries in Oracle

Similar hierarchical queries can be created in Oracle by using CONNECT BY PRIOR and START WITH keywords. Comparing with SQL Server, we may need a temporary table to populate the result set to use in subsequent queries where in SQL Server CTE can be used directly as result set.

Oracle Script:

Notes:

Script to find the ancestors in Oracle DB is uploaded to the GitHub repository given below. Also, the scripts to create table and insert demo data are available there.

[https://github.com/mailsanish/HierarchicalQueries]

Other Databases

Similar solution exists in MySQL and other related databases. And incase if there is no CTE like solutions available, we can use the above-mentioned classic technique by simply using a basic while loop and temporary table and it will do the magic for us.

Final Words

After 18 years now, it may look like a low-level solution, however back then it was a life saver for all of us which we used in various projects. And as I mentioned just above, it is really a handy tool to easily identify the corrupted data in a hierarchical data while CTE simply fails with an error. Moreover, all we need are just a while loop and a temporary table with an identity column. At least it saved us weeks of labor. Smart work over hard work :)

Please feel free to share your thoughts.

Links

Complete scripts available on the Git Repository.

GitHub: https://github.com/mailsanish/HierarchicalQueries

Medium: https://medium.com/@sanish.abraham/sql-hierarchical-queries-without-recursion-and-cursors-the-life-before-cte-in-sql-server-bf7745e382de

LinkedIn: https://www.linkedin.com/in/sanishabraham/

Keywords:

Hierarchical Query without CTE, SQLServer, Oracle, Hierarchical Queries, Without Cursor and Recursion, While loop, Temporary table, Table variable, Identity column, CTE, Finding Circular Reference, Cyclic Dependency, Cyclic Complexity, Finding invalid data in parent child relationship table, Multi-level table structure

HashTags:

#Hierarchical_Query_without_CTE #Hierarchical_Queries #sqlserver #hierarchy #ctesp #commontableexpressions #nocte #multileveltables #Non_CTE #HierarchicalQueries #sql #organization #Without_Cursor_Recursion #Hierarchical_Queries_With_While_And_Temp_Table #Circular_Reference #Cyclic_Dependency #reference #database #dataarchitect #data #script #tsql #parentchildrelationship #ctesp #noncte #sql #sqlserver7 #sqlserver2000 #scripting #oracle #connectbyprior #cyclicdependency #circularreference

--

--

Sanish Abraham

Solution Architect | Technical Consultant | Creative Thinker