Quick recap of SQL and SQL Joins

Ronald Wahome
6 min readAug 2, 2018

While SQL is standard, there are different flavors of the language including Oracle, Sql Server, MySql, Postgre SQL, DB2, FoxPro and Access. The main difference occurs in data types, functions, and procedure syntax.

classicmodels schema

The image above shows an example of a relational database with the various relationships between the tables. Also called a Schema

SQL operates on a Relational Database Management System (RDBMS) which is a data model for managing large collections of data. The model exists as a collection of data objects called tables that have multiple columns and rows containing related data. If you are just starting out, there is a good step by step tutorial at w3schools.com that you can work your way through the basic syntax.

Let’s see a sample line of code below.

SELECT * FROM employees;

From the code above we were able to view the contents of the data table employees. The most common syntax is the SELECT keyword which extracts data from a dataset. The asterisk (*) tells the select keyword to grab all contents FROM the table employees. This is very basic but it is a good illustration of how SQL works. For the full documentation of MySQL which is a sublanguage of SQL, check out the mysql documentation at the official website

SQL Joins

As you go deeper into SQL, one of the most important concepts to master is how to combine different tables as needed. This is especially important in a business or research setting where the wrong data is worse than no data and can result in very expensive mistakes.

The different types of SQL Joins
  1. Full Join: Return all records when there is a match in either left or right table
  2. Right Join: Return all records from the right table, and the matched records from the left table
  3. Left Join: Return all records from the left table, and the matched records from the right table
  4. Inner Join: Returns records that have matching values in both tables
  5. Cross Join: Produces a result set which is the number of rows in the first table multiplied by the number of rows in the second table
  6. Self Join: Return a table is joined with itself

We will go through a typical SQL Join by answering a typical coding challenge during a job interview. The question is based on the popular classicmodels database shown above which can be downloaded from mysqltutorial.org.

Question:

For each employee who represents customers, output the total # of orders that employee’s customers have placed alongside the total sale amount of those orders. The employee name should be output as a single column named “Sales Rep” formatted as “lastName, firstName”. The second column should be titled “# Orders” and the third should be “Total Sales”. Sort the output by Total Sales descending. Only (and all) employees with the job title ‘Sales Rep’ should be included in the output, and if the employee made no sales the Total Sales should display as “0.00”.

Answer:

As a quick reminder, the schema contains the following tables structure.

classicmodels database tree

The first thing to understand is what the question is asking. To facilitate this process, I have created a table to help us visualize the desired information. We need to output the employee, her total orders as well as total sales.

The next step is formatting the output to match the correct column labels.

Now that we understand the question, we can get to the coding part done in MySQL. The code up next should do the trick. Well, almost. It has a place holder for another code segment that is the result of joining multiple tables.

-- this code will not run
-- it has a PLACE HOLDER for a nested SELECT statement
-- ---------------------------------------------------
SELECT
salesRep AS ‘Sales Rep’,
count(quantityOrdered) AS ‘# Orders’,
SUM(totalSales) AS ‘Total Sales’
FROM ( !!PLACE HOLDER FOR JOINED TABLES!!)
GROUP BY employeeNumber
ORDER BY SUM(totalSales) DESC;

We need to join three tables to end up with a table that has employee name, number of orders and price per item. We can then get the total sales per employee by multiplying the price by number of orders to get the total sales.

Join Steps:

  1. Start with the Customer table. We need to include the entire table because we need information from all customer purchases.
  2. Right join the Employee table with the employeeNumber as the key. After this step we should end up with a table containing all customers and employees responsible for the sales.
  3. Left join the result table from step 2 to the Orders table. Left join means that we are going to keep all the information from the left table and matching information from the right. The common key for both of these tables is customerNumber.
  4. Left join the result table from step 3 to orderDetails table. At this stage we will get the amount of orders and price which ties up our final table to finally pass into the formatting SELECT statement. The key for both of these tables is orderNumber.
-- THE CODE BELOW IS FOR THE PLACE HOLDER ABOVE
-- JOINS HAPPEN IN THIS NESTED SELECT
-- ----------------------------------------------------
SELECT
e.employeeNumber,
CONCAT(e.lastName, ‘, ‘, e.firstName) AS salesRep,
od.quantityOrdered,
od.priceEach,
CASE
WHEN od.quantityOrdered IS NULL
THEN 0.00
WHEN od.quantityOrdered = 0
THEN 0.00
WHEN od.quantityOrdered > 0
THEN od.quantityOrdered * od.priceEach
END AS totalSales,
e.jobTitle
FROM customers c
RIGHT JOIN
(
SELECT employeeNumber,
lastName,
firstName,
jobTitle
FROM employees
WHERE jobTitle = ‘Sales Rep’
) AS e
ON e.employeeNumber = c.salesRepEmployeeNumber
LEFT JOIN orders o
ON o.customerNumber = c.customerNumber
LEFT JOIN orderdetails od
ON o.orderNumber = od.orderNumber) AS x

Next we group the resulting table by employeeNumber to get multiple sales per employee. Finally we order the sales by employee to get the final dollar value from the total sales associated with every employee.

GROUP BY employeeNumber
ORDER BY SUM(totalSales) DESC;

The full final code is as shown below.

SELECT 
salesRep AS ‘Sales Rep’,
count(quantityOrdered) AS ‘# Orders’,
SUM(totalSales) AS ‘Total Sales’
FROM
(
SELECT
e.employeeNumber,
CONCAT(e.lastName, ‘, ‘, e.firstName) AS salesRep,
od.quantityOrdered,
od.priceEach,
CASE
WHEN od.quantityOrdered IS NULL
THEN 0.00
WHEN od.quantityOrdered = 0
THEN 0.00
WHEN od.quantityOrdered > 0
THEN od.quantityOrdered * od.priceEach
END AS totalSales,
e.jobTitle
FROM customers c
RIGHT JOIN
(
SELECT employeeNumber,
lastName,
firstName,
jobTitle
FROM employees
WHERE jobTitle = ‘Sales Rep’
) AS e
ON e.employeeNumber = c.salesRepEmployeeNumber
LEFT JOIN orders o
ON o.customerNumber = c.customerNumber
LEFT JOIN orderdetails od
ON o.orderNumber = od.orderNumber) AS x
GROUP BY employeeNumber
ORDER BY SUM(totalSales) DESC;

The final output from our code is shown below with the column names formatted to the question’s specifications.

This was just a quick peek but SQL joins are not as complicated as one may think. However, it is important to pay attention to how the tables are joined with one another. It is possible to return a result set from a join with missing values even if the right key is used. I like to visualize the ending table and work my way backwards. There are a lot more complex joins that you can do depending on the required result but best way to get a hang of it is to practice, practice, practice.

There are more coding challenges and my implementation if you wan’t to test your skills on SQL based on the same database at my github account.

--

--

Ronald Wahome

Learn | Apply | Data Science. I write about Python, Java, Data Science & Big Data in general. linkedin.com/in/ronaldwahome