A data science newbie's guide through SQL.

Chapter One — Introduction to SQL

Lorna Maria A
3 min read · May 7, 2019

SQL — Structured Query Language.

Choose your pronunciation: Sequel or Ess-que-el

What is SQL?

SQL is a programming language designed to manage data stored in relational databases. A relational database holds records in tables, with keys linking each table to the others. Because the data is structured, SQL can be used to retrieve, alter and otherwise manipulate it.

This, however, is not where the "structured" in SQL comes from. It refers to the syntax of the language itself: every query is built from clauses that follow a set format.
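To make the idea of tables linked by keys a little more concrete, here is a minimal sketch. It assumes SQLite, driven from Python's built-in sqlite3 module, purely because it needs no setup; the customers and orders tables and their columns are made up for illustration. The SELECT at the end also shows the clause-by-clause shape of a query (SELECT, FROM, JOIN, ORDER BY).

```python
import sqlite3

# In-memory database, so this sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables: orders.customer_id is the key that links
# each order back to a row in customers.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Amina'), (2, 'Brian');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# Follow the key from orders to customers to combine the two tables.
joined_rows = conn.execute("""
    SELECT c.name, o.order_id, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in joined_rows:
    print(row)

conn.close()
```

Each printed row pairs a customer name with one of that customer's orders, which is the kind of link the keys make possible.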

Where is SQL used?

Today SQL is widely used in many web frameworks and database applications. It is a highly sought-after querying language because it is easy to use and performs well even against large databases.

Its ability to query many data points and return results in a short time is impressive.

How is SQL used in data science?

As a data scientist, your process begins with obtaining the data to be analysed. This data is usually stored in a database or a spreadsheet. Most modern system architectures include structured databases, so you need SQL to comb out the data you would like to analyse.

SQL is used to retrieve data from a database as specified (through queries) and can be used to store data too (by creating tables).
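As a rough sketch of both halves, storing and retrieving, the snippet below creates a table, puts a few rows in it and then queries them back. It again assumes SQLite through Python's sqlite3 module, and the survey_responses table and its rows are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Store: create a hypothetical table and insert a few made-up rows.
conn.execute(
    "CREATE TABLE survey_responses (respondent TEXT, country TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO survey_responses VALUES (?, ?, ?)",
    [("r1", "Kenya", 29), ("r2", "Nigeria", 34), ("r3", "Kenya", 41)],
)

# Retrieve: the query specifies exactly which rows and columns we want.
kenya_rows = conn.execute(
    "SELECT respondent, age FROM survey_responses "
    "WHERE country = ? ORDER BY age",
    ("Kenya",),
).fetchall()

print(kenya_rows)
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to hand parameters to a query from Python.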

SQL can manipulate data with built-in functions that perform simple transformations and summaries as part of the query itself.
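For instance, functions such as UPPER, COUNT, AVG, SUM and ROUND can reshape and summarise data while it is being queried. The sketch below assumes SQLite through Python's sqlite3 module, and the sales table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sales figures, just enough to summarise.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 20.0), ("west", 5.0)],
)

# Built-in functions do the manipulation inside the query:
# UPPER tidies the text, COUNT/AVG/SUM summarise each group.
summary_rows = conn.execute("""
    SELECT UPPER(region), COUNT(*), ROUND(AVG(amount), 2), SUM(amount)
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()

for row in summary_rows:
    print(row)

conn.close()
```

The available functions differ a little between database systems, so it is worth checking the documentation of the one you are using.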

SQL enables you to run tests by allowing you to create and destroy test tables.
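Here is a small sketch of that create-and-destroy cycle, again assuming SQLite through Python's sqlite3 module. The scratch_test table and the table_exists helper are hypothetical names used only for this example; the helper looks the table up in sqlite_master, SQLite's own catalogue of tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_exists(connection, table_name):
    """Return True if a table with this name is listed in SQLite's catalogue."""
    row = connection.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


# Create a throwaway table and try something out on it.
conn.execute("CREATE TABLE scratch_test (id INTEGER, label TEXT)")
conn.execute("INSERT INTO scratch_test VALUES (1, 'trial run')")
exists_after_create = table_exists(conn, "scratch_test")
row_count = conn.execute("SELECT COUNT(*) FROM scratch_test").fetchone()[0]

# Destroy it once the test is done.
conn.execute("DROP TABLE scratch_test")
exists_after_drop = table_exists(conn, "scratch_test")

print(exists_after_create, row_count, exists_after_drop)
conn.close()
```

Because the test table is separate from your real tables, dropping it afterwards leaves the rest of the database untouched.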

As a data scientist, knowing SQL gives you the upper hand in understanding how to store and retrieve data in relational databases.

Most companies have adopted a relational database management system (RDBMS), so you need SQL to retrieve the data you are interested in from a company database before analysing it.

What do data scientists think about SQL?

I asked two data scientists who use SQL what they think, and here is what they had to say:

"I love SQL because even if data is updated, I can re-run the same script without any worries," says Shel Kariuki.

Oreoluwa Ogundipe says: "Knowing the data you need to analyse is key, but being able to query it in the best and fastest way that meets your needs gives you a greater edge. Unlike tools where you have to create computed dimensions that do not represent how your data is stored, with SQL you can query your data straight away and then filter your responses as you see fit."

Thank you so much for following along with chapter 1 of the SQL series. Next week I will share how to prepare yourself and your computer, as a data science newbie, to start writing SQL.

Feel free to share your feedback with me by leaving a clap, commenting or tweeting me @kalmpublication or @lornamariak

Happy Learning!😍


Lorna Maria A

Data Science | Rstats | Life and Travel | Tech Meet-ups