ON DATA ENGINEERING
The path to learning SQL and mastering it to become a Data Engineer
SQL is one of the key tools used by data engineers to model business logic, extract key performance metrics, and create reusable data structures. There are, however, different types of SQL to consider for data engineers: Basic, Advanced Modelling, Efficient, Big Data, and Programmatic. The path to learning SQL involves progressively learning these different types.
What is “Basic SQL”?
Learning “Basic SQL” means learning the key operations used to manipulate data, such as aggregations and joins, along with concepts such as the grain of a table.
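As a minimal sketch of these basic operations, the snippet below joins two tables and aggregates the result. It uses Python's built-in sqlite3 module only so that the SQL can run without any setup; the customers and orders tables and their columns are hypothetical and made up for illustration.

```python
import sqlite3

# In-memory database, so nothing needs to be installed or cleaned up.
conn = sqlite3.connect(":memory:")

# Hypothetical tables and rows, invented for this example.
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'FR'), (2, 'FR'), (3, 'DE');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 50.0), (13, 3, 5.0);
""")

# Join each order to its customer, then aggregate to one row per country.
revenue_by_country = conn.execute("""
    SELECT c.country, COUNT(*) AS order_count, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()

print(revenue_by_country)  # [('DE', 1, 5.0), ('FR', 3, 100.0)]
```

The input has one row per order, while the output has one row per country: the GROUP BY clause is what changes the grain of the result.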
Where to learn it
Basic SQL can be learned from reference websites such as W3Schools or, for a more practical approach, from interactive platforms such as DataCamp or Dataquest. These websites give a decent grasp of SQL's core concepts, such as the different operations, functions, subqueries, and joins. However, some concepts that are central to data engineering, such as working with the grain of a table or dataset, are often not discussed as extensively as they deserve.
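To illustrate why grain deserves attention, here is a small sketch of one common pitfall: joining an order-level table to an item-level table and then summing an order-level value. The orders and order_items tables are hypothetical, and sqlite3 is used only to make the SQL runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables, invented for this example.
conn.executescript("""
    -- Grain: one row per order.
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, shipping_fee REAL);
    -- Grain: one row per item within an order.
    CREATE TABLE order_items (order_id INTEGER, sku TEXT, price REAL);
    INSERT INTO orders VALUES (1, 5.0), (2, 5.0);
    INSERT INTO order_items VALUES (1, 'A', 10.0), (1, 'B', 15.0), (2, 'A', 10.0);
""")

# The join moves the result to the item grain, so the fee of order 1
# appears twice (once per item) and the total is overstated.
naive_fees = conn.execute("""
    SELECT SUM(o.shipping_fee)
    FROM orders AS o
    JOIN order_items AS i ON i.order_id = o.order_id
""").fetchone()[0]

# Aggregating the items back to the order grain before joining keeps
# exactly one row per order, so both sums are correct.
correct_totals = conn.execute("""
    SELECT SUM(o.shipping_fee), SUM(t.items_total)
    FROM orders AS o
    JOIN (
        SELECT order_id, SUM(price) AS items_total
        FROM order_items
        GROUP BY order_id
    ) AS t ON t.order_id = o.order_id
""").fetchone()

print(naive_fees)      # 15.0 (overstated: the true total is 10.0)
print(correct_totals)  # (10.0, 35.0)
```

Neither query raises an error, which is what makes grain mistakes easy to miss: the only symptom is a number that is too large.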
One of the main challenges of learning SQL is setting up a database and getting access to datasets. Installing a local database has become quite easy these days, but it still takes some time. After that, tables need to be created and datasets loaded into them before they are usable for practical learning.
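One low-friction way to get started, sketched below, is SQLite through Python's standard library, which avoids installing a database server. This is only one possible setup, not a recommendation from any of the sites above; the trips table and the inline CSV sample are hypothetical stand-ins for a real downloaded dataset.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; in practice this would be a CSV file on disk.
SAMPLE_CSV = """trip_id,city,distance_km
1,Paris,3.2
2,Paris,7.5
3,Berlin,1.1
"""

# ":memory:" keeps the database in RAM; pass a file path to persist it.
conn = sqlite3.connect(":memory:")

# Step 1: create the table.
conn.execute(
    "CREATE TABLE trips (trip_id INTEGER, city TEXT, distance_km REAL)"
)

# Step 2: parse the CSV and convert each field to the column's type.
reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = [
    (int(r["trip_id"]), r["city"], float(r["distance_km"]))
    for r in reader
]

# Step 3: load the rows into the table.
conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
conn.commit()

# The table is now ready for practice queries.
row_count = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
print(row_count)  # 3
```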
This type of knowledge is generally tested in screening interview questions, such as the histogram question, which check how well candidates have grasped concepts such as granularity or joins. This kind of question also reflects the SQL knowledge typically expected of fresh graduates embarking on data engineering careers.
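The exact wording of the histogram question varies between interviewers. The sketch below assumes one common form: given a table of orders, count how many customers placed one order, two orders, and so on. The orders table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: customer 100 has 3 orders, 300 has 2, 200 and 400 have 1.
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO orders VALUES
        (1, 100), (2, 100), (3, 100),
        (4, 200),
        (5, 300), (6, 300),
        (7, 400);
""")

# Inner query: change the grain from one row per order to one row per customer.
# Outer query: change it again to one row per distinct order count.
histogram = conn.execute("""
    SELECT order_count, COUNT(*) AS customer_count
    FROM (
        SELECT customer_id, COUNT(*) AS order_count
        FROM orders
        GROUP BY customer_id
    ) AS per_customer
    GROUP BY order_count
    ORDER BY order_count
""").fetchall()

print(histogram)  # [(1, 2), (2, 1), (3, 1)]
```

The question tests granularity because it requires two aggregations in sequence, each one moving the data to a coarser grain.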
Data engineers need to be able to model complex transformations, and learning some advanced analytical SQL helps model this kind of behavior. Two main things support this kind of use case: 1) advanced queries and 2) data models.