Backfill Your SQL Tables Without Breakage Before Anyone Finds Out You Were Wrong

Re-loading missing data will be one of the least glamorous but most important tasks you do as a SQL developer. Get it right.

Zach Quinn
Learning SQL

Man filling a hole with a shovel.
Backfilling IRL. Photo by Daniel Lincoln on Unsplash.

Why You Need to Backfill Your SQL Tables

Ugh.

Whether I find out from an alerting system or directly from a stakeholder, “ugh” is my natural reaction when I learn that we have missing data.

As with many aspects of data-oriented work, context determines whether your missing data is a minor headache or a three-alarm fire.

In any case, identifying and fixing missing data must be a priority for anyone who works directly with data that guides organizational decision makers, because missing, incomplete or error-riddled data can distort both real-time and historical analysis.

To account for these gaps, SQL developers (typically data engineers) work through a sometimes-grueling process called backfilling.

If you’re unfamiliar, backfilling is a catch-all industry term for the CRUD (create, read, update, delete) operations involved in correcting incomplete or incorrect data after it should have been loaded.
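
To make that definition concrete, here is a minimal sketch of one common backfill pattern: delete a date range from the target table, then re-insert it from the source, all in one transaction. It is an illustration under stated assumptions, not a prescribed method. The tables `source_events` and `daily_sales`, the `backfill` helper and the sample rows are all hypothetical, and SQLite (via Python's built-in `sqlite3` module) is used only so the SQL can run anywhere.

```python
import sqlite3

# Hypothetical tables for illustration only:
#   source_events = system of record, daily_sales = reporting table with gaps.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_events (sale_date TEXT, amount REAL);
    CREATE TABLE daily_sales (sale_date TEXT PRIMARY KEY, total REAL);

    INSERT INTO source_events VALUES
        ('2024-01-01', 10.0),
        ('2024-01-02', 20.0),
        ('2024-01-02', 5.0),
        ('2024-01-03', 7.5);

    -- Simulate the problem: 2024-01-02 never loaded, 2024-01-03 loaded wrong.
    INSERT INTO daily_sales VALUES
        ('2024-01-01', 10.0),
        ('2024-01-03', 1.0);
""")


def backfill(conn, start_date, end_date):
    """Rebuild daily_sales for an inclusive date range from source_events."""
    with conn:  # single transaction: commits on success, rolls back on error
        # Remove whatever is currently stored for the range (missing or wrong).
        conn.execute(
            "DELETE FROM daily_sales WHERE sale_date BETWEEN ? AND ?",
            (start_date, end_date),
        )
        # Re-create the range from the source of truth.
        conn.execute(
            """
            INSERT INTO daily_sales (sale_date, total)
            SELECT sale_date, SUM(amount)
            FROM source_events
            WHERE sale_date BETWEEN ? AND ?
            GROUP BY sale_date
            """,
            (start_date, end_date),
        )


backfill(conn, "2024-01-02", "2024-01-03")
backfill(conn, "2024-01-02", "2024-01-03")  # safe to re-run: same result

rows = conn.execute(
    "SELECT sale_date, total FROM daily_sales ORDER BY sale_date"
).fetchall()
print(rows)
```

Because the delete and the insert share a transaction and cover the same range, running the backfill twice leaves the table in the same state as running it once, which matters when a fix has to be retried.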

Journalist—>Sr. Data Engineer; new stories every Monday.