Introduction to SQL (Part 1)

Faisal Afif
Data Science Indonesia
5 min readOct 6, 2023

A beginner guide for learning SQL from scratch

Photo by Maxime Horlaville on Unsplash

SQL, also known as Structured Query Language and pronounced as “sequel”, is a programming language designed to query, analyze and transform data from a relational database. Due to its simplicity, SQL is one of the most powerful and widely used tools in database management system. Learning SQL is essential for anyone aiming to be a data analyst, data engineer, or data scientist.

Before learning SQL a bit further, it’s crucial that we understand what a relational database is, the structure of relational database and various “flavors” of SQL.

Relational Database

A relational database represents a collection of two or more related tables. Each table consists of rows (also known as records) and columns (also known as attributes or fields), it is similar to spreadsheet like Excel or Google Sheets. Rows represent individual instances of the entity, and columns represent the properties or attributes of the entity. Relationships between tables are established using keys. A primary key uniquely identifies each row in a table, and foreign keys establish links between tables.

For example, the following are two tables from a database of ecommerce. The first table, customers table, contains the customer_id, unique_id, zip_code, city and state. The second table, orders table, contains order_id, customer_id, order_statuts, etc. The primary key in orders table is order_id and the foreign key in orders table is customer_id, the foreign key establishes link between orders table and customers table. In customers table, customer_id is the the primary key.

All data stored in a relational database corresponds to a certain data type. The following are some of the most common data types:

  • INTEGER, a positive or negative whole number.
  • TEXT, CHAR and VARCHAR, a text string.
  • DATE, the date formatted as YYYY-MM-DD.
  • REAL or FLOAT, a decimal value.

Meanwhile, a relational database management system (RDBMS) is software that manages relational databases and allows users to update to the database through an implementation of SQL. There are various “flavors” or implementations of SQL that are specific to different database management systems. Some well-known RDBMSs and SQL flavors are as follows:

  1. MySQL utilizes the MySQL flavor. It is known for its speed and open-source nature.
  2. PostgreSQL utilizes PostgreSQL’s flavor of SQL.
  3. SQLite: SQLite uses a standard SQL syntax and is known for its lightweight nature
  4. Microsoft SQL Server utilizes T-SQL (Transact-SQL), an extension of SQL developed by Microsoft. It’s designed for enterprise-level database management.
  5. MariaDB: A MySQL fork, it uses a similar SQL syntax to MySQL.
Figure 3. Top 10 most popular database environment from Stack Overflow 2023 Developer Survey

SELECT, DISTINCT, FROM, ORDER BY and LIMIT

Now that we know what a relational database is and the various “flavors” of SQL, let’s deep dive into SQL. In SQL, we need to use SELECT and FROM clause to retrieve data from relational database, usually known as queries. Combining SELECT with DISTINCT clause will return unique values in the specific column, while adding ORDER BY clause will allow you to sort data in ascending or descending order. Combining ORDER BY with LIMIT will allow you to select the top n records that have been sorted in ascending or descending order.

In this section we will learn how to select for specific columns and all columns from a table, how to select for top 5 rows, how to select for unique values in specific column, and how to sort data based on its value in ascending or descending order in specific column.

1. Query for selecting specific column

-- Selecting column_1 and column_2 from table_1
SELECT
column_1,
column_2
FROM
table_1;

2. Query for selecting all columns

-- Selecting all columns from table_2
SELECT
*
FROM
table_2;

3. Query for selecting first 5 rows of all columns

SELECT
*
FROM
table_1
LIMIT
5;

4. Query for sorting ascending and descending

-- Sorting Ascending
SELECT
*
FROM
table_1
ORDER BY
column_1 ASC;

-- Sorting Descending
SELECT
*
FROM
table_1
ORDER BY
column_1 DESC;

5. Query for selecting unique values in a column

SELECT DISTINCT
column1
FROM
table_1;

Now we can try to answer some exercises in SQL Bolt

Figure 4. Exercise 1 in SQL Bolt

To answer the 1st, 2nd and 3rd questions in Exercise 1 of SQL Bolt you need to do these queries:

-- 1st Question
SELECT
title
FROM
movies;

-- 2nd Question
SELECT
director
FROM
movies;

-- 3rd Question
SELECT
title,
director
FROM
movies;

Now I believe you can solve 4th and 5th questions in Exercise 1 by yourself. Then we can continue to solve these questions. Are you ready?

Figure 5. Exercise 4 in SQL Bolt

To answer the 1st and 2nd questions in Exercise 4 of SQL Bolt you need to do these queries:

-- 1st Question
SELECT
DISTINCT director
FROM
movies
ORDER BY
director ASC;

-- 2nd Question
SELECT
*
FROM
movies
ORDER BY
year DESC
LIMIT
4;

Now you can try to answer the 3rd and 4th questions in Exercise 4 of SQL Bolt. If you found any difficulties or obstacles around these exercises, please leave a comment and I will do my best to assist you.

So far, we’ve already learned how to retrieve data from a relational database using the SELECT, FROM, DISTINCT, ORDER BY, and LIMIT clauses, as well as these clauses combinations. See you in the next section of a beginner guide for learning SQL from scratch.

--

--