What happens when you join two tables in SQL?

Kelvin Gakuo
7 min read · Nov 12, 2021


An SQL query is not code. It's plain text that describes the data (and its structure) that you expect from the database. The DB then decides, in a series of steps, how it's going to return the data you want: it converts your plain-text instructions into an executable plan.

In this post I explore the (most common) algorithms that Postgres chooses from to implement a JOIN query. I implement them in Python and run experiments on a DB of 5M-ish rows to see the choices in action. The algorithms are:

a. Nested loop join
b. (Sort) Merge join
c. Hash join

Note:
- My code isn’t all that optimised
- We don’t explore all the nitty gritties of the algos. For instance, I don’t go into depth about collision resolution in hash join

Nested loop join

The nested loop join is the simplest implementation and works as follows:

  1. Pick one of the tables as the “outer” table and the other as the “inner” table. When doing a LEFT JOIN, for instance, the “outer” table would be the one on the left
  2. For each of the rows of the “outer” table, check each row of the “inner” table for matching rows.

You may have noticed that this sounds awfully like a simple nested loop:

for each outer_row in outer_table:
    for each inner_row in inner_table:
        if (outer_row and inner_row satisfy condition):
            return row(s)

An implementation that does LEFT, RIGHT and INNER JOINs could look something like the sketch below. It's a minimal version that assumes each table is a list of dicts keyed by column name; the sample data is reconstructed from the outputs that follow:
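def nested_loop_join(left, right, key, how="inner"):
    # For a RIGHT JOIN, swap the inputs and run the LEFT JOIN logic
    if how == "right":
        left, right = right, left

    joined = []
    for outer_row in left:
        matched = False
        for inner_row in right:
            if outer_row[key] == inner_row[key]:
                joined.append({**outer_row, **inner_row})
                matched = True
        # LEFT/RIGHT JOIN keeps unmatched outer rows, padded with None
        # (assumes the other table has at least one row to read columns from)
        if not matched and how != "inner":
            padding = {col: None for col in right[0] if col != key}
            joined.append({**outer_row, **padding})
    return joined

tbl1 = [
    {"id": 1, "tbl1_field": "A"},
    {"id": 2, "tbl1_field": "AB"},
    {"id": 5, "tbl1_field": "AE"},
    {"id": 3, "tbl1_field": "AC"},
    {"id": 4, "tbl1_field": "AD"},
]
tbl2 = [
    {"id": 1, "tbl2_field": "B"},
    {"id": 1, "tbl2_field": "BB"},
    {"id": 2, "tbl2_field": "BC"},
    {"id": 6, "tbl2_field": "BD"},
    {"id": 7, "tbl2_field": "BE"},
]

print(nested_loop_join(tbl1, tbl2, "id", how="left"))
print(nested_loop_join(tbl1, tbl2, "id", how="right"))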

For the LEFT JOIN on tbl1.id, the output:

[
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'B'},
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'BB'},
{'id': 2, 'tbl1_field': 'AB', 'tbl2_field': 'BC'},
{'id': 5, 'tbl1_field': 'AE', 'tbl2_field': None},
{'id': 3, 'tbl1_field': 'AC', 'tbl2_field': None},
{'id': 4, 'tbl1_field': 'AD', 'tbl2_field': None}
]

For the RIGHT JOIN on tbl2.id, the output:

[
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'B'},
{'id': 1, 'tbl1_field': 'A', 'tbl2_field': 'BB'},
{'id': 2, 'tbl1_field': 'AB', 'tbl2_field': 'BC'},
{'id': 6, 'tbl1_field': None, 'tbl2_field': 'BD'},
{'id': 7, 'tbl1_field': None, 'tbl2_field': 'BE'}
]

Joining two tables with m and n rows:

  1. The time complexity is O(m*n), since we loop over all n rows of the inner table for each of the m rows of the outer table

Due to the inefficient nature of this algo, it's only preferred for very small inputs (or, as we'll see in Experiment III, when filters leave very few rows to join)

(Sort) Merge join

The (sort) merge join requires that the input tables be pre-sorted on the join attribute.

The algorithm walks both tables simultaneously, comparing them row by row as follows:

  1. Let the smaller table be the left and the larger the right. Start with cursors at the first row of each
  2. If the join attributes match, output the combined record. Advance the right cursor. Repeat until there's no longer a match.
  3. If the records do not match, advance the cursor pointing at the smaller of the two join values. Since the tables are sorted, that value can't have a match further along in the other table, so it's safe to move past it
  4. The algorithm stops once either table runs out of rows

(Sort) Merge JOIN strategy. This one took me forever to understand

We can sketch a simple inner (sort) merge join as follows, again on lists of dicts (this two-cursor version handles duplicate join values by re-scanning runs of equal keys):
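def sort_merge_join(left, right, key):
    # Sort step: order both inputs on the join attribute
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])

    joined = []
    i = j = 0
    # Join step: walk both tables with two cursors
    while i < len(left) and j < len(right):
        if left[i][key] < right[j][key]:
            i += 1
        elif left[i][key] > right[j][key]:
            j += 1
        else:
            # On a match, emit every right-side row in the run of equal
            # keys, then advance the left cursor; j stays put so that
            # duplicate left-side keys re-scan the same run
            k = j
            while k < len(right) and right[k][key] == left[i][key]:
                joined.append({**left[i], **right[k]})
                k += 1
            i += 1
    return joined

# Reusing tbl1/tbl2 from the nested loop sketch above
print(sort_merge_join(tbl1, tbl2, "id"))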

Joining two tables with m and n rows:

  1. The time complexity is a combination of the sort step (for each table) and the join step:
    a) The sort step (depending on the implementation) can be O(n log n) + O(m log m)
    b) For the join step, the best case is when each join value appears at most once in each table. The complexity is then O(m+n)

The total therefore is O(n log n + m log m), since the sort step dominates the merge

Hash join

The hash join is usually the go-to strategy when there are no useful indexes or pre-sorted tables. The classic version works in two phases (build and probe) as follows:

  1. Build phase
    Creates a hash table using one of the tables (usually the smaller one, but not always). The key of each bucket is the (hashed) join attribute and the value, the record itself
  2. Probe phase
    For each row of the bigger table, compute a hash on the join attribute using the same hash function used in the “build” phase, then check the hash table created above for a matching hash. (Note: The hash join strategy works for equijoin predicates only i.e. tbl1.id = tbl2.id)
    Since hash collisions may exist, for every match found, the join attribute of each matched record is compared to make sure the records actually match.

# Create a hash map (build)
for each record in smaller_table:
    key = hash_function(record[join_attr])
    hash_map.append({"key": key, "value": record})

# Match the larger table (probe)
for each record in larger_table:
    key = hash_function(record[join_attr])
    mtch = hash_map.find(key)
    if mtch exists:
        compare records
        return records

This strategy has relatively good performance when the hash table can fit into memory.

A simple INNER JOIN implementation can be sketched as follows. Python's built-in dict stands in for the hash table (so hashing and collision resolution come for free); the explicit equality check mirrors the probe-phase comparison described above:
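from collections import defaultdict

def hash_join(left, right, key):
    # Build phase: hash the smaller input into buckets keyed by the join attribute
    build, probe = (left, right) if len(left) <= len(right) else (right, left)
    buckets = defaultdict(list)
    for record in build:
        buckets[record[key]].append(record)

    # Probe phase: look up each row of the larger input in the hash table
    joined = []
    for record in probe:
        for match in buckets.get(record[key], []):
            # Redundant given Python's dict semantics, but mirrors the
            # re-check a real hash join does in case of collisions
            if match[key] == record[key]:
                joined.append({**match, **record})
    return joined

# Reusing tbl1/tbl2 from the nested loop sketch above
print(hash_join(tbl1, tbl2, "id"))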

Joining two tables with m and n rows:

  1. The time complexity is O(m) for creating the hash map plus O(n) for probing the hash map. In the average case, time complexity for a hash map lookup is O(1). Therefore, the total complexity is O(m+n)

Experimenting on Postgres

For experimentation purposes, I created three tables (and wrote them into CSV files):

  • persons — 3.2M rows
  • transactions — 1M rows
  • person_transactions (a junction table) — 1M rows

I then loaded them into a Postgres DB (locally hosted using Docker).

To spin up the database:

docker-compose -f compose-postgres.yml up

To delete and start afresh:

docker-compose down && sudo docker-compose down --volumes

The init SQL file creates the tables and loads the CSVs, roughly like the sketch below (the exact column lists and CSV paths here are assumptions; the real file is in the repo linked next):
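-- Sketch only: columns beyond those used in the queries below,
-- and the CSV paths, are assumptions
CREATE TABLE persons (
    person_id INT PRIMARY KEY,
    first_name TEXT,
    last_name TEXT
);

CREATE TABLE transactions (
    transaction_id INT PRIMARY KEY
);

CREATE TABLE person_transactions (
    person_id_fk INT REFERENCES persons (person_id),
    transaction_id_fk INT REFERENCES transactions (transaction_id)
);

COPY persons FROM '/data/persons.csv' CSV HEADER;
COPY transactions FROM '/data/transactions.csv' CSV HEADER;
COPY person_transactions FROM '/data/person_transactions.csv' CSV HEADER;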

You can get this project from my GitHub:

git clone https://github.com/kelvingakuo/Largeish-Data.git
cd Largeish-Data/join-strategies

To explore the database, I'm using DBeaver.

Experiment DB ER diagram

On Postgres, we can use the EXPLAIN keyword to see the DB's query plan and estimated costs. Adding the ANALYZE keyword actually executes the query, letting us compare the estimated costs against the actual ones.

Experiment I

Say we wanted the name and transaction ID for each person with a transaction. We would use a query like so:

explain analyze SELECT p.first_name, p.last_name, pt.transaction_id_fk FROM persons p
RIGHT JOIN person_transactions pt on p.person_id = pt.person_id_fk;

Hash JOIN query plan

Since there’s no clear reason to use a merge join (sorted tables) or nested loop (small tables) , notice that Postgres went with a hash join as follows:

  1. Create a hash table using a Seq Scan on the person_transactions table. 131,072 buckets were created
  2. Seq Scan the persons table, hash each row and find a matching hash in the hash table above

Experiment II

Let’s repeat the same query but pre-sort our tables.

-- Pre-sort the tables on the JOIN attr into views
create view sorted_persons as
select * from persons
order by person_id;

create view sorted_junction as
select * from person_transactions
order by person_id_fk;

-- JOIN on the sorted views
explain analyze SELECT p.person_id, p.first_name, p.last_name, pt.transaction_id_fk FROM sorted_persons p
RIGHT JOIN sorted_junction pt on p.person_id = pt.person_id_fk;

(Sort) Merge JOIN query plan

Now that we’re joining on sorted tables, notice that the DB picked a (sort) merge join.

PS: The materialize node in the query plan is there because we're using views to pre-sort our tables. A view stores the query that's then run (materialised) when needed, i.e. we didn't create sorted tables; we created queries that generate sorted tables on demand.

Experiment III

Say we wanted all the transactions done by the person with ID 2529132:

explain analyze SELECT p.person_id, p.first_name, p.last_name, pt.transaction_id_fk FROM persons p 
RIGHT JOIN person_transactions pt on p.person_id = pt.person_id_fk
where pt.person_id_fk = 2529132;

Nested loop JOIN query plan

The following happens:

  1. Using 2 workers, a Parallel Seq Scan is done on the person_transactions table. The table is filtered for the ID, returning 2 rows
  2. Since Postgres creates an index for every primary key constraint, an Index Scan is done on the persons table. The table is filtered for the ID, returning 1 row
  3. With very few rows to JOIN, a simple nested loop join is then picked!

Further reading

  1. The different types of Postgres scans and how Postgres decides on which one to use — https://severalnines.com/database-blog/overview-various-scan-methods-postgresql
  2. Implementation of indexes in Postgres — https://www.postgresql.org/docs/current/btree-implementation.html
  3. How to read Postgres’ EXPLAIN output — https://docs.gitlab.com/ee/development/understanding_explain_plans.html
  4. Types of JOIN predicates supported by the different algos — https://hackernoon.com/python-and-data-engineering-under-the-hood-of-join-operators
