Why SQL and How do we use it?

The importance of SQL in Data and Business Professionals

Alifia C Harmadi
Analytics Vidhya
6 min readFeb 1, 2022

--

Photo by Annie Spratt on Unsplash

SQL or Structured Query language is the standard language for dealing with Relational Databases. Most of the time, SQL is used to retrieve data, update the tables, or operate on the structure of the database or tables.

Why is it important?

The majority of world’s raw data is organised in collections of tables that called relational databases. This world’s raw data can be the data of online shopping records, banking system, library management system, electronic medical records, and many more. In order to get this data from the databases in an effective way, knowing on how to used the SQL is required. SQL can help to wrangle and extract data from these databases.

Almost all professionals that works with data need to know how to use SQL such as Database Developer, Data Engineer, Data Analyst, Business Intelligence, Data Scientist, and other jobs that interact with databases in their daily tasks.

How do we use it?

Before jump to how do we write a query in SQL. It is better to know some command in SQL.

Command Description

CREATE -> Creates a new table, a view of a table, or another object in the database.

ALTER -> Modifies an existing database object, such as a table.

DROP ->Deletes an entire table, a view of a table or other objects in the database.

SELECT -> Retrieves certain records from one or more tables.

INSERT ->Creates a record.

UPDATE -> Modifies a record.

DELETE -> Deletes a record.

These SQL commands are mainly categorised into four categories as:

  1. DDL — Data Definition Language
  2. DQl — Data Query Language
  3. DML — Data Manipulation Language
  4. DCL — Data Control Language
Image by geeksforgeeks

In here, I will mainly explain about DQL with a bit of DDL and DML as these are the commands categories that I usually use and had used before. These are also probably the most common categories that have commands which people use when they start to learn SQL.

DDL is used to create, modify, and delete database structures but not data. Database Administrator and Database Developer could be mostly use this command category in their day to day tasks.

DML is used to store, modify, retrieve, delete and update data in a database. Similar to DDL, Database Administrator and Database Developer use this command category, yet it is not ruled out that other roles could use this command as long as they have an access to do it.

DQL is used for performing queries on the data within schema objects. Almost all data roles use this command category -from Database Developer/Engineer to Business Intelligence and Data Analyst. DQL is often considered part of DML, when adding FROM or WHERE data manipulators to the SELECT statement.

If you would like to know more of this and the other two categories, you can visit SQL | DDL, DQL, DML, DCL and TCL Commands — GeeksforGeeks!

Here, I will explain you the clauses from DQL that widely used when querying data using SQL. Before that, let us know the different between SQL commands and SQL clauses. Suppose that a query as a statement. The command it self is the statement. Yet, in order to have a complete statement, we need to have a condition of the statement. This condition is the clause. Once the statement completes, we only need to execute this statement to get what we want. Thus, we should have SQL commands function to make this works. In DQL case, the command is SELECT.

SQL Clauses in DQL & DML command category:

Below are the clauses that I, personally, use for my daily tasks.

1. DISTINCT

This used to return unique value only from a table as a column often contains many duplicate values.

2. FROM

This is a mandatory clause in SELECT statement to specifies the tables of data that we want to have. For instance, we want to know list of regions in the data. Thus, we need to specify the table of Country using FROM

2. WHERE

This lets you select rows based on a boolean expression. Only rows for which the expression evaluates to TRUE are returned in the result. Take an example below, where we only need a data of country that have recycling percentage more than 50%.

4. GROUP BY

This groups a result into subsets that have matching values for one or more columns. We typically will use a GROUP BY clause in conjunction with an aggregate expression.

5. HAVING

This restricts the results of a GROUP BY. Having clause is actually similar with WHERE clause, but this one is for aggregated columns.

6. ORDER BY

This allows you to specify the order in which rows appear in the result set. Using the example from Having clauses, we want to see customer with the highest number of orders first.

7. AS

This provides an alias for each column and table. Imagine if you have a quite long table name or you need to put a new name for aggregated columns like the example in Having and ORDER BY, we need to use AS to make it short and easy to read.

8. LIMIT

This uses to set a limitation of the maximum number of rows that we will get in the result. Example, we want to get the top 5 customers who have the highest number of orders using the same query example from ORDER BY.

9. FILTER

This is a modifier used on an aggregate function to limit the values used in an aggregation. Example, we want to know the number of item in the shop for each category.

10. JOIN

This is used to combine rows from two or more tables, based on a related column between them. There are several kind of JOIN that we can use in SQL .

Image by Google

11. WITH

This is used to give a sub-query block a name. Example creating sub-query using query sample in JOIN clause.

Conclusion

Knowing and having the SQL skill is essential if you want to have a career in data field because the data that you will deal with in the workplace is not in a spreadsheet like the one that you get on Kaggle for a competition. Therefore, start your learning from now!

I will probably continue to talk about SQL in the next story talking about where can you practice your SQL skills when you are still a student or still not getting a job yet. So, stay tuned for more updates!!

--

--

Alifia C Harmadi
Analytics Vidhya

A philomath & data engineer. Passionate about ML, DL & data storytelling.