SQL for Data Scientists, in Under 6 Minutes
An Essential Skill for Any Data Science Résumé
Data scientists often work with DataFrames, whether in R or Python. However, the vast amounts of data common in today’s data science (so-called ‘Big Data’) often cannot be loaded in full into a DataFrame, or even stored in a single .csv file. Instead, the data lives in large databases, and among the most common are relational databases queried with SQL.
SQL is remarkably simple and easy to learn; for a data scientist already familiar with DataFrame operations, it is mostly a matter of learning the syntax. Because of its popularity, SQL integrates with numerous tools, from pandas for data analysis to PHP for server-side web applications.
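To make the "just a matter of learning syntax" point concrete, here is a minimal sketch that runs one query through Python's built-in sqlite3 module and shows the rough DataFrame equivalent in a comment. The `people` table, its columns, and its rows are hypothetical, invented only for this illustration; SQLite is assumed purely because it needs no setup.

```python
import sqlite3

# In-memory database so the sketch needs no setup; a real project would
# connect to an existing database instead.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented purely for illustration.
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ada", 36, "London"), ("Grace", 45, "New York"), ("Alan", 41, "London")],
)

# Rough DataFrame equivalent:
#   df[df["city"] == "London"][["name", "age"]].sort_values("age")
rows = conn.execute(
    "SELECT name, age FROM people WHERE city = ? ORDER BY age",
    ("London",),
).fetchall()

print(rows)
conn.close()
```

The filter, the column selection, and the sort each map onto one SQL clause (`WHERE`, `SELECT`, `ORDER BY`). In pandas, `pandas.read_sql_query` can run the same kind of query and return the result directly as a DataFrame.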
SQL is such an easy and useful skill to add to a résumé that it would be almost wasteful not to spend a few minutes learning it.
This article will cover enough of SQL to perform ordinary data science operations. Let’s get into it!
Contents
- Selecting tables and columns, selecting distinct values, selecting top values, ordering columns
- Conditional selecting, selecting rows, selecting top values
- Inserting, updating, and deleting values
- Basic Statistical Values: minimum, maximum, count…