Why Your BigQuery SQL Query Will Or Won’t Run
Delve into the subtle technical and situational factors that determine whether your SQL query runs, stalls or fails.
Late on a Thursday night at Baltimore Washington International Airport, I anxiously watched my flight information. My flight was on time, but my Google Cloud instance was running a query that had already consumed nearly 2 hours.
I decided that if it didn’t produce results or display an error message before my boarding group was called, I would shut down my laptop and give up. We had an automatic query canceler script for jobs exceeding 2 hours, so I was prepared to let this one be killed with quiet dignity.
This query was quickly becoming infamous among the data engineering and data analysis teams. I’ve tackled tough queries before, but this one was different because of the scope of data it needed to access and the rigidity of the requirements that brought it into existence. If you’ve read my work in this and other publications, you’ll notice that my previous SQL insights focused on performance tuning: disassembling and rebuilding problematic queries so they run faster.
Here, however, I want to revisit the fundamentals and talk about the factors, both subtle and obvious, that directly affect the ultimate decider of query quality and viability: execution.