DATA SCIENCE

How to Understand Long and Complex SQL Queries

A “peel-and-understand-the-layer” strategy for complex SQL queries

Naser Tamimi
Published in CodeX
3 min read · Feb 2, 2022



A single SQL query for ad hoc analysis or reporting typically runs between 2 and 40 lines. But when it comes to data pipelines and scheduled queries, a single SQL query can easily run to hundreds of lines (believe me!).

A single SQL query with hundreds of lines, dozens of CTEs (common table expressions), and multiple JOINs of different kinds can be intimidating. Instead of losing your self-confidence, you can follow my “How to peel a SQL query” recipe.

STEP 1) Big Picture First!

No one can understand a long SQL query at first glance. Even the most experienced data engineers need time to digest a long and complex one. So don’t panic. You cannot understand and memorize every detail of the joins and filters in one pass. Instead, focus on the overall data flow first, whether your goal is to understand the query or to debug it.
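One way to get that big picture quickly is to list the CTEs and note which earlier CTEs each one reads from. The sketch below is only a rough illustration: the query, the table names, and the helper functions `cte_bodies` and `cte_dependencies` are all made up for this example, and the regular expression is a heuristic, not a real SQL parser (parentheses inside string literals or comments would confuse it).

```python
import re

# Hypothetical query, invented purely for illustration.
QUERY = """
WITH orders_clean AS (
    SELECT customer_id, amount
    FROM orders
    WHERE amount > 0
),
customer_totals AS (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders_clean
    GROUP BY customer_id
)
SELECT c.name AS customer_name, t.total_spent AS total_spent
FROM customers AS c
JOIN customer_totals AS t ON t.customer_id = c.id
"""

# Heuristic: a CTE definition looks like "<name> AS (".
CTE_START = re.compile(r"(\w+)\s+AS\s*\(", re.IGNORECASE)


def cte_bodies(sql):
    """Map each CTE name to the text between its parentheses."""
    bodies = {}
    for match in CTE_START.finditer(sql):
        depth = 1
        pos = match.end()
        # Walk forward until the opening parenthesis is balanced.
        while pos < len(sql) and depth > 0:
            if sql[pos] == "(":
                depth += 1
            elif sql[pos] == ")":
                depth -= 1
            pos += 1
        bodies[match.group(1)] = sql[match.end():pos - 1]
    return bodies


def cte_dependencies(sql):
    """For every CTE, list the other CTEs mentioned in its body."""
    bodies = cte_bodies(sql)
    deps = {}
    for name, body in bodies.items():
        deps[name] = [
            other
            for other in bodies
            if other != name and re.search(rf"\b{re.escape(other)}\b", body)
        ]
    return deps


for name, parents in cte_dependencies(QUERY).items():
    print(name, "<-", ", ".join(parents) or "base tables only")
```

For a query with hundreds of lines, an outline like this gives you a map of the data flow before you read a single filter or join condition.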

STEP 2) Focus on the Final Columns First!

Skip all the CTEs and try to understand the last (main) SELECT first. Ask yourself what happens to each of the columns that come out of all those crazy CTEs and JOINs. Don’t care about…
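To make the "final columns first" idea concrete, here is a small sketch that asks the database for the names of the columns the main SELECT produces, without reading any CTE. It assumes an in-memory SQLite database and a made-up query with made-up tables (`customers`, `orders`); with another database you would use its own client library instead.

```python
import sqlite3

# Hypothetical schema and query, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    """
)

QUERY = """
WITH orders_clean AS (
    SELECT customer_id, amount
    FROM orders
    WHERE amount > 0
),
customer_totals AS (
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders_clean
    GROUP BY customer_id
)
SELECT c.name AS customer_name, t.total_spent AS total_spent
FROM customers AS c
JOIN customer_totals AS t ON t.customer_id = c.id
"""

# Running the query (even on empty tables) is enough to learn
# what the final result looks like.
cursor = conn.execute(QUERY)
final_columns = [column[0] for column in cursor.description]
print(final_columns)
```

Once you know the final columns, you can trace each one backwards through the aliases (`c`, `t`) to the CTE that creates it.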
