Don’t be Circular

Rodrigo Navarro
Akurey
Published in
5 min readFeb 18, 2020

When creating a database model design there are a lot of good practices we could follow in order to have a good design, but there are also bad practices in which we could fall if we are not careful. One of these bad practices is called circular references.

Just as in any programming language, when you add a reference in a first file to a second file and then on the second file you add a reference to the first file, you will get an error. Same goes for an SQL design, one type of circular reference in SQL is made when a first table references a second and that second table references the first table.

Here is a quick and simple example of how this would look in a model:

The figure shows tables DBC_States and DBC_Cities in which DBC_Cities has a reference to DBC_States as StateId and DBC_States has a reference to DBC_Cities as PrincipalCityId.

The principal problem on circular references is that when you start inserting data into the tables you won’t be able to since none of the references can accept nulls and you can’t insert one record without having another on the other table.

This model can be easily fixed by moving the PrincipalCity value to the DBC_Cities table:

The figure shows a DBC_States table and a DBC_Cities table with IsPrincialCity as a bit and StateId as a reference to DBC_States.

Now lets see a more complex design and identify a more complex type of circular reference:

The figure shows DBC_Countries, DBC_States, DBC_Cities and DBC_VisitedCities all linked on a 1 to N relation. Then DBC_Countries, DBC_Users and DBC_VisitedCities all linked on 1 to N relation. The DBC_Users and DBC_Countries show the nationality of that user.

An easy way to identify a circular reference is to start on a table which is serving as the primary table for two or more foreign keys, in this case, lets pick DBC_Countries. If we start going up we meet, DBC_Users and then DBC_VisitedCities, there the link ends, since DBC_VisitedCities does not serve as the primary key for any other table. Then we do the same process with the other link, start at DBC_Countries, go to DBC_States, DBC_Cities and end up in DBC_VisitedCities.

The figure shows the same model as before but with arrows pointing from DBC_Countries to DBC_VisistedCities using the two paths available.

Since there are two ways you can use to get from one table to another, there is a circular reference issue that could lead to inconsistencies on your data.

Fixing this one requires a little more thought than the first one we fixed:

The figure shows how to fix the previous model. Adding a table DBC_Nationalities between DBC_Countries and DBC_Users.

In this model, if you set on DBC_Countries and try to go to DBC_VisitedCities, the only way to go is using DBC_States to DBC_Cities and DBC_VisitedCities. The other route is cut off on DBC_Nationalities. This model eliminates the circular reference and allows our app to support multiple nationalities, if ever needed.

Don’t confuse a circular reference with a circle in a diagram

The concept of a circular reference can sometimes be confused with a diagram model that forms a circle, but as we saw in the example before, the model forms a circle, but there is no circular reference.

Circular references are bad, but sometimes needed

There are several database management systems that allow deferrable constraints, which would allow you to have circular references, since you can insert into one table, insert into the other and then update the constraints. They allow it because it isn’t always a bad thing, sometimes it’s the only way to implement something. Take the following example:

You need to build a system that allows a user to add levels which have a parent which is also a level, whatever a level is in that context.

First model that comes to mind could be something like this:

The figure shows a table DBC_Levels that has LevelId as primary key and ParentLevelId as a reference to itself.

And it’s ok, it has no circular reference since the ParentLevelId can be null. It also works, you can store the data of a level and each one has a parent which is also a level. Everything is good. Queries might get out of hand a little since you would have to use recursive queries but we will leave that for another post.

The problem now is that your app is growing and now as requirement you have to make it possible for all levels to have N parents. Now your model needs to evolve into something like this:

The figure shows two tables, DBC_Levels and DBC_LevelParents, the later with LevelId and ParentId as references to DBC_Levels.

Which has a circular reference, but there is no other way of modeling this graph like problem. Several database managers are aware of this issue and allow other ways of creating graph databases, but that is also out of the scope of this article.

Detecting circular references

The examples in this post are pretty straightforward and easy to detect, but real life isn’t as easy and you can easily create a model that has 50 or 100 tables, now detecting circular references isn’t as easy.

But not all is lost, you can code something that will let you know if there are circular references in your DB model, first you need to list all the foreign keys, then filter the ones that show more than once and find all possible paths to that key and if it has more than two paths to the same table then you’ve found an issue.

References

https://github.com/Wuodan/SQL-Find-Circular-References#code-explained

https://stackoverflow.com/questions/1006917/are-circular-references-acceptable-in-database

https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-sample?view=sql-server-ver15

--

--