Learning SQL the Hard Way
By writing it
A Data Scientist who doesn’t know SQL is not worth his salt.
And that seems correct to me in every sense of the word. While we feel much more accomplished creating models and coming up with different hypotheses, the role of data munging can't be overstated.
And with the ubiquity of SQL in ETL and data preparation tasks, everyone should know at least a little of it to be useful.
I still remember the first time I got my hands on SQL. It was the first language (if you can call it that) I learned. And it made an impact on me. I was able to automate things, and that was something I hadn’t thought of before.
Before SQL, I used to work with Excel — VLOOKUPs and pivots. I was creating reporting systems, doing the same work again and again. SQL made it all go away. Now I could write a big script, and everything would be automated — all the crosstabs and analysis generated on the fly.
That is the power of SQL. And though you could do anything you do in SQL using Pandas, you still need to learn SQL to deal with systems like Hive, Teradata, and sometimes Spark too.
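To make the "crosstabs generated on the fly" idea concrete, here is a minimal sketch of a crosstab written in SQL and run through Python's built-in `sqlite3` module. The `sales` table, its columns, and the sample rows are all hypothetical, made up purely for illustration; a real reporting script would point at your own tables.

```python
import sqlite3

# Hypothetical sales data; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Q1", 100.0),
        ("North", "Q2", 150.0),
        ("South", "Q1", 80.0),
        ("South", "Q2", 120.0),
        ("South", "Q2", 30.0),
    ],
)

# Crosstab via conditional aggregation:
# one row per region, one column per quarter.
crosstab_sql = """
SELECT region,
       SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1_total,
       SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2_total
FROM sales
GROUP BY region
ORDER BY region
"""
rows = conn.execute(crosstab_sql).fetchall()
for row in rows:
    print(row)
```

Once a query like this is saved in a script, rerunning the report is just rerunning the script, which is the kind of automation that replaces rebuilding a pivot table by hand.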
This post is about installing SQL, explaining SQL and running SQL.