Making Sense of SQL | Part 1

Spencer Wightman
3 min readMay 26, 2022

--

SQL reading my statements

This article series is for anyone beginning to learn PostgreSQL. It focuses on problem solving and understanding why basic, increasingly complex statements don’t evaluate as expected. Part 2 examines constraints and functions, Part 3 clauses, with joins and duplication all the way through.

food
name | quality | weight
-----------+---------+--------
Apple | ripe | 50
Bread | fresh | 10
Cheese | AAA | 25
Pepperoni | fresh | 25
(4 rows)
recipes
name | time | temp | ingredients
-----------+------+------+-------------
Apple Pie | 5 | 475 | 4
Calzone | 65 | 400 | 7
Pizza | 65 | 475 | 7
(3 rows)

SELECT name FROM food; The declarative logic is brutishly straightforward. We read column ‘name’ from table ‘food’. A good start.

SELECT name FROM food, recipes; This doesn’t quite work. Both tables have a column ‘name’, so SQL is confused. The desire to view column ‘name’ from both tables does not follow. We must specify the table: SELECT food.name, recipes.name FROM food, recipes;. We do not have to specify the table when the column names are unique across tables. In any case, the returned table has duplicate items. This occurs even when we remove recipes.name:

name    
-----------
Apple
Bread
Cheese
Pepperoni
Apple
Bread
Cheese
Pepperoni
Apple
Bread
Cheese
Pepperoni
(12 rows)

We can get around this problem by using clause DISTINCT to return unique rows: SELECT DISTINCT food.name, recipes.name FROM food, recipes;

name       |   name    
-----------+-----------
Apple | Apple Pie
Cheese | Pizza
Cheese | Calzone
Apple | Calzone
Bread | Pizza
Bread | Apple Pie
Bread | Calzone
Cheese | Apple Pie
Pepperoni | Apple Pie
Pepperoni | Pizza
Pepperoni | Calzone
Apple | Pizza
(12 rows)

Twelve distinct rows…showing every possible combination of food.name and recipes.name values: effectively a CROSS JOIN or Cartesian Product.

Maybe adding a condition will help:SELECT food.name, recipes.name FROM food, recipes WHERE food.quality = 'fresh' AND recipes.temp = 475;

name       |   name    
-----------+-----------
Bread | Apple Pie
Pepperoni | Apple Pie
Bread | Pizza
Pepperoni | Pizza
(4 rows)

Duplication still occurs because SQL evaluates FROM before clause WHERE.

We can try using clause LIMIT to filter our results: SELECT food.name, recipes.name FROM food, recipes WHERE food.quality = 'fresh' AND recipes.temp = 475 LIMIT 2;

name       |   name    
-----------+-----------
Bread | Apple Pie
Pepperoni | Apple Pie
(2 rows)

This is worse than DISTINCT! Values are missing because, as with WHERE, the two row LIMIT applies to the FROM Cartesian Product.

At this point, it seems prudent to link the two tables. There are no common values, so let’s alter the tables to add one: ALTER TABLE food ADD COLUMN id serial; (and again for recipes). Now we can try a different WHERE condition: SELECT food.name, recipes.name FROM food, recipes WHERE food.id = recipes.id;

name    |   name    
--------+-----------
Apple | Apple Pie
Bread | Calzone
Cheese | Pizza
(3 rows)

Pepperoni is missing because it is the 4th food row — recipes has 3 rows, so there is no match.

LEFT JOIN should work well here: SELECT food.name, recipes.name FROM food LEFT JOIN recipes ON recipes.id = food.id;

name       |   name    
-----------+-----------
Apple | Apple Pie
Bread | Calzone
Cheese | Pizza
Pepperoni |
(4 rows)

Success! The LEFT JOIN allows the first table entered in the statement to display all rows, while requiring the second table to meet the ON condition. It accomplishes this by assigning null in place of second table values that cannot meet the ON condition.

Finally, let’s clarify the return table by explicitly naming its columns: SELECT food.name AS "Food Names", recipes.name AS "Recipe Names" FROM food LEFT JOIN recipes ON recipes.id = food.id;

Food Names  | Recipe Names 
------------+--------------
Apple | Apple Pie
Bread | Calzone
Cheese | Pizza
Pepperoni |
(4 rows)

This ends the first part of our SQL journey. Thank you for reading! You can continue with keys and constraints in Part 2. Please note: the SQL statement single line format in this article is to preserve space.

--

--