Picking The Wrong SQL Join Key Cost Me Weeks Of Accurate Data. Don’t Let It Happen To You.

Even if you’re choosing the correct SQL JOIN, you could still make a tiny mistake that could cost you — or your org — big time.

Zach Quinn
Pipeline: Your Data Engineering Resource

--

Once you master SQL JOINs, create a job-worthy data science portfolio. Learn how with my free project guide.

Debugging A SQL JOIN Key Mistake

One of my most frustrating SQL debugging sessions at work didn’t involve overhauling a CTE or refactoring a user defined function; instead, it all came down to one function: LOWER().

In this instance even though I chose the correct JOIN to merge the necessary tables, I overlooked a critical and undervalued component of the JOIN: The join key, or column, I had chosen to match the rows in each table.

Because I was working with STRING types, there was the possibility that the data would include STRING representations in sentence case, all lower and capital forms.

By not applying a function that would make the values uniform, I was omitting data that couldn’t be matched and, by extension, producing an output that suggested our latest change wasn’t working.

After myself, fellow engineers and even our SQL-inclined management took a look, someone finally suggested LOWER() as the welcomed but also frustratingly simple fix.

--

--