SQL Quality Assurance Queries

How to construct queries in SQL to confirm data accuracy.

Zach Quinn
Learning SQL

--

A dart board target.
Photo by Afif Kusuma on Unsplash

My most time-consuming task at work is not creating ingestion pipelines, writing complex queries or AirFlow orchestration.

Each of these tasks results in a clear, observable output generated by Python or a similar engine.

Given a reasonable amount of clarity, technical knowledge and domain awareness, these tasks can be completed in a few hours or a few days.

Incidentally, my most time-consuming task as a data engineer is a technically simple task: Quality assurance.

The reason quality assurance consumes more time and resources is because, unlike other tasks which often have a base level expectation of functionality, quality assurance has a high-level expectation of accuracy.

When data science students begin learning the disciplines of data science, data analysis, data engineering and data architecture, there is an assumption that any output, so long as it matches the book or an instructor’s expectations is the correct output.

However, to maintain a professional quality of accuracy and reliability, data practitioners have to be incredibly skeptical in their assessments and thorough in their QA procedures.

--

--

Zach Quinn
Learning SQL

Journalist—>Sr. Data Engineer; new stories every Monday.