“Your SQL is not like my SQL”
One candidate I was interviewing for an analytics position, blurted out the following sentence:
Your SQL is not like my SQL
In the database world SQL is near-ubiquitous to the point that now even Excel does SQL, this predominance leads to a few issues in identifying what knowing SQL really means. Some people think knowing SQL is just a SELECT * away, others think it is a little bit more than that.
In my view there is three schools of SQL, Transactional, Analytical & the data engineering school, which while not mutually exclusive will tend to stress some particular features over others.
Transactional SQL: Is mostly for Software Engineer working on production databases, who need to ensure consistency in high frequency environment. If you are applying for this kind of position it will be necessary to understand beside basic SQL such concepts as transaction locks and triggers.
Analytical SQL: The SQL of Analysts, broad enough is so misunderstood that I believe needs to be split in two:
- Basic Analytical SQL: Which deals with the proper use of table’s granularity and aggregations. For positions requiring Basic analytical SQL, a good knowledge of subqueries and joins is necessary. Some of the more complex use case of basic analytical SQL deal with issues such as value allocation or attributions.
- Advanced Analytical SQL: Essentially covers less mainstream functions and techniques, such as Window Functions, List-Aggregation, Recursive CTE & Hierachical queries, REGEXP as well as proprietary SQL extension such as Oracle’s model clauses. Some of the use cases of these techniques are re-sessionization, basket analysis and path analysis.
- Optimisation & Fine Tuning: It is essential for data engineers and DBA to be able to properly optimize queries, especially when dealing with large datasets.To this end being able to read and act on an explain plan and act on it become quickly necessary.
- Table Design & Table creation: Data Engineers need to create tables all the times, their structures need to be efficient and for that Knowing the different types of indices, when to use partitions or buckets and when to use them become the tools of the trade.
- Stored Procedures & Procedural SQL: Data Engineers need to know the extensions of the SQL languages such as PL/SQL that allows for procedural programming as well as the use of cursors, declarative variables or dynamic sql.
- UDF: Sometimes … what you are looking for doesn’t exists and someone has to create it. Data Engineers would sometimes need to write custom user defined function for their or their customers’ use cases.