What happens when you join two tables in SQL?

Kelvin Gakuo
7 min read · Nov 12, 2021


An SQL query is not code. It's plain text that describes the data (and its structure) that you expect from the database. The DB then decides, in a series of steps, how it's going to return the data you want: it converts your plain-text instructions into an executable plan.

In this post I explore the (most common) algorithms that Postgres chooses from to implement a JOIN query. I implement them in Python and run experiments on a DB of 5M-ish rows to see the choices in action. The algorithms are:

a. Nested loop join
b. (Sort) Merge join
c. Hash join

Note:
- My code isn’t all that optimised
- We don’t explore all the nitty gritties of the algos. For instance, I don’t go into depth about collision resolution in hash join

Nested loop join

The nested loop join is the simplest implementation and works as follows:

  1. Pick one of the tables as the “outer” table and the other as the “inner” table. When doing a LEFT JOIN, for instance, the “outer” table would be the one on the left
  2. For each of the rows of the “outer” table, check each row of the “inner” table for matching rows.

You may have noticed that this sounds awfully like a simple nested loop:

for each outer_row in outer_table:
    for each inner_row in inner_table:
        if (outer_row and inner_row satisfy condition):
            return row(s)

An implementation that does LEFT, RIGHT and INNER JOINs could look something like the sketch below. It's a minimal version that assumes each table is a list of dicts keyed by column name; the sample data is reconstructed from the outputs that follow:
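def nested_loop_join(left, right, key, how="inner"):
    # For a RIGHT JOIN, swap the inputs and run the LEFT JOIN logic
    if how == "right":
        left, right = right, left

    joined = []
    for outer_row in left:
        matched = False
        for inner_row in right:
            if outer_row[key] == inner_row[key]:
                joined.append({**outer_row, **inner_row})
                matched = True
        # LEFT/RIGHT JOIN keeps unmatched outer rows, padded with None
        # (assumes the other table has at least one row to read columns from)
        if not matched and how != "inner":
            padding = {col: None for col in right[0] if col != key}
            joined.append({**outer_row, **padding})
    return joined

tbl1 = [
    {"id": 1, "tbl1_field": "A"},
    {"id": 2, "tbl1_field": "AB"},
    {"id": 5, "tbl1_field": "AE"},
    {"id": 3, "tbl1_field": "AC"},
    {"id": 4, "tbl1_field": "AD"},
]
tbl2 = [
    {"id": 1, "tbl2_field": "B"},
    {"id": 1, "tbl2_field": "BB"},
    {"id": 2, "tbl2_field": "BC"},
    {"id": 6, "tbl2_field": "BD"},
    {"id": 7, "tbl2_field": "BE"},
]

print(nested_loop_join(tbl1, tbl2, "id", how="left"))
print(nested_loop_join(tbl1, tbl2, "id", how="right"))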

For the LEFT JOIN on tbl1.id, the output:

[
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'B'},
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'BB'},
{'id': 2, 'tbl1_field': 'AB', 'tbl2_field': 'BC'},
{'id': 5, 'tbl1_field': 'AE', 'tbl2_field': None},
{'id': 3, 'tbl1_field': 'AC', 'tbl2_field': None},
{'id': 4, 'tbl1_field': 'AD', 'tbl2_field': None}
]

For the RIGHT JOIN on tbl2.id, the output:

[
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'B'},
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'BB'},
{'id': 2, 'tbl1_field': 'AB', 'tbl2_field': 'BC'},
{'id': 6, 'tbl1_field': None, 'tbl2_field': 'BD'},
{'id': 7, 'tbl1_field': None, 'tbl2_field': 'BE'}
]

Joining two tables with m and n rows:

  1. The time complexity is O(m*n), since we loop over all n rows of the inner table for each of the m rows of the outer table

Due to the inefficient nature of this algo, it's only preferred for very small inputs (or, as we'll see in Experiment III, when filters leave very few rows to join)

(Sort) Merge join

The (sort) merge join requires that the input tables be pre-sorted on the join attribute.

The algorithm walks both tables simultaneously, comparing them row by row as follows:

  1. Let the smaller table be the left and the larger the right. Start with cursors at the first row of each
  2. If the join attributes match, output the combined record. Advance the right cursor. Repeat until there's no longer a match.
  3. If the records do not match, advance the cursor pointing at the smaller of the two join values. Since the tables are sorted, that value can't have a match further along in the other table, so it's safe to move past it
  4. The algorithm stops once either table runs out of rows

(Sort) Merge JOIN strategy. This one took me forever to understand

We can sketch a simple inner (sort) merge join as follows, again on lists of dicts (this two-cursor version handles duplicate join values by re-scanning runs of equal keys):
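def sort_merge_join(left, right, key):
    # Sort step: order both inputs on the join attribute
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])

    joined = []
    i = j = 0
    # Join step: walk both tables with two cursors
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            # On a match, emit every right-side row in the run of equal
            # keys, then advance the left cursor; j stays put so that
            # duplicate left-side keys re-scan the same run
            k = j
            while k < len(right) and right[k][key] == left[i][key]:
                joined.append({**left[i], **right[k]})
                k += 1
            i += 1
    return joined

# Reusing tbl1/tbl2 from the nested loop sketch above
print(sort_merge_join(tbl1, tbl2, "id"))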

Joining two tables with m and n rows:

  1. The time complexity is a combination of the sort step (for each table) and the join step:
    a) The sort step (depending on the implementation) can be O(n log n) + O(m log m)
    b) For the join step, the best case is when each join value appears at most once in each table. The complexity is then O(m+n)

The total therefore is O(n log n + m log m), since the sort step dominates the merge

Hash join

The hash join is usually the go-to strategy when there are no useful indexes or pre-sorted tables. The classic version works in two phases (build and probe) as follows:

  1. Build phase
    Creates a hash table using one of the tables (usually the smaller one, but not always). The key of each bucket is the (hashed) join attribute and the value, the record itself
  2. Probe phase
    For each row of the bigger table, compute a hash on the join attribute using the same hash function used in the “build” phase, then check the hash table created above for a matching hash. (Note: The hash join strategy works for equijoin predicates only i.e. tbl1.id = tbl2.id)
    Since hash collisions may exist, for every match found, the join attribute of each matched record is compared to make sure the records actually match.

# Create a hash map (build)
for each record in smaller_table:
    key = hash_function(record[join_attr])
    hash_map.append({"key": key, "value": record})

# Match the larger table (probe)
for each record in larger_table:
    key = hash_function(record[join_attr])
    mtch = hash_map.find(key)
    if mtch exists:
        compare records
        return records

This strategy has relatively good performance when the hash table can fit into memory.

A simple INNER JOIN implementation can be sketched as follows. Python's built-in dict stands in for the hash table (so hashing and collision resolution come for free); the explicit equality check mirrors the probe-phase comparison described above:
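from collections import defaultdict

def hash_join(left, right, key):
    # Build phase: hash the smaller input into buckets keyed by the join attribute
    build, probe = (left, right) if len(left) <= len(right) else (right, left)
    buckets = defaultdict(list)
    for record in build:
        buckets[record[key]].append(record)

    # Probe phase: look up each row of the larger input in the hash table
    joined = []
    for record in probe:
        for match in buckets.get(record[key], []):
            # Redundant given Python's dict semantics, but mirrors the
            # re-check a real hash join does in case of collisions
            if match[key] == record[key]:
                joined.append({**match, **record})
    return joined

# Reusing tbl1/tbl2 from the nested loop sketch above
print(hash_join(tbl1, tbl2, "id"))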

Joining two tables with m and n rows:

  1. The time complexity is O(m) for creating the hash map plus O(n) for probing the hash map. In the average case, time complexity for a hash map lookup is O(1). Therefore, the total complexity is O(m+n)

Experimenting on Postgres

For experimentation purposes, I created three tables (and wrote them into CSV files):

  • persons — 3.2M rows
  • transactions — 1M rows
  • person_transactions (a junction table) — 1M rows

I then loaded them into a Postgres DB (locally hosted using Docker).

To spin up the database:

docker-compose -f compose-postgres.yml up

To delete and start afresh:

docker-compose down && sudo docker-compose down --volumes

The init SQL file creates the tables and loads the CSVs, roughly like the sketch below (the exact column lists and CSV paths here are assumptions; the real file is in the repo linked next):
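-- Sketch only: columns beyond those used in the queries below,
-- and the CSV paths, are assumptions
CREATE TABLE persons (
    person_id INT PRIMARY KEY,
    first_name TEXT,
    last_name TEXT
);

CREATE TABLE transactions (
    transaction_id INT PRIMARY KEY
);

CREATE TABLE person_transactions (
    person_id_fk INT REFERENCES persons (person_id),
    transaction_id_fk INT REFERENCES transactions (transaction_id)
);

COPY persons FROM '/data/persons.csv' CSV HEADER;
COPY transactions FROM '/data/transactions.csv' CSV HEADER;
COPY person_transactions FROM '/data/person_transactions.csv' CSV HEADER;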

You can get this project from my GitHub:

git clone https://github.com/kelvingakuo/Largeish-Data.git
cd Largeish-Data/join-strategies

To explore the database, I'm using DBeaver.

Experiment DB ER diagram

On Postgres, we can use the EXPLAIN keyword to see the DB's query plan and estimated costs. Adding the ANALYZE keyword actually executes the query, letting us compare the estimated costs against the actual ones.

Experiment I

Say we wanted the name and transaction ID for each person with a transaction. We would use a query like so:

explain analyze SELECT p.first_name, p.last_name, pt.transaction_id_fk FROM persons p
RIGHT JOIN person_transactions pt on p.person_id = pt.person_id_fk;

Hash JOIN query plan

Since there’s no clear reason to use a merge join (sorted tables) or nested loop (small tables) , notice that Postgres went with a hash join as follows:

  1. Create a hash table using a Seq Scan on the person_transactions table. 131,072 buckets were created
  2. Seq Scan the persons table, hash each row and find a matching hash in the hash table above

Experiment II

Let’s repeat the same query but pre-sort our tables.

-- Pre-sort the tables on the JOIN attr into views
create view sorted_persons as
select * from persons
order by person_id;

create view sorted_junction as
select * from person_transactions
order by person_id_fk;

-- JOIN on the sorted views
explain analyze SELECT p.person_id, p.first_name, p.last_name, pt.transaction_id_fk FROM sorted_persons p
RIGHT JOIN sorted_junction pt on p.person_id = pt.person_id_fk;

(Sort) Merge JOIN query plan

Now that we’re joining on sorted tables, notice that the DB picked a (sort) merge join.

PS: The materialize node in the query plan is there because we're using views to pre-sort our tables. A view stores the query that's then run (materialised) when needed, i.e. we didn't create sorted tables; we created queries that generate sorted tables on demand.

Experiment III

Say we wanted all the transactions done by the person with ID 2529132:

explain analyze SELECT p.person_id, p.first_name, p.last_name, pt.transaction_id_fk FROM persons p 
RIGHT JOIN person_transactions pt on p.person_id = pt.person_id_fk
where pt.person_id_fk = 2529132;

Nested loop JOIN query plan

The following happens:

  1. Using 2 workers, a Parallel Seq Scan is done on the person_transactions table. The table is filtered for the ID, returning 2 rows
  2. Since Postgres creates an index for every primary key constraint, an Index Scan is done on the persons table. The table is filtered for the ID, returning 1 row
  3. With very few rows to JOIN, a simple nested loop join is then picked!

Further reading

  1. The different types of Postgres scans and how Postgres decides on which one to use — https://severalnines.com/database-blog/overview-various-scan-methods-postgresql
  2. Implementation of indexes in Postgres — https://www.postgresql.org/docs/current/btree-implementation.html
  3. How to read Postgres’ EXPLAIN output — https://docs.gitlab.com/ee/development/understanding_explain_plans.html
  4. Types of JOIN predicates supported by the different algos — https://hackernoon.com/python-and-data-engineering-under-the-hood-of-join-operators
