The Differences Between Joins
Joins in SQL can be hard to understand, at first, if you have little-to-no programming or data analysis experience. As a current student in a 7 month Backend Engineering program with no previous programming experience, I found them to be a bit challenging to conceptualize, until I looked at this picture:
I know it still looks a little confusing but let’s break it down a little and see if we can make joins a little easier to understand.
I’ll tap into my passion for comics to illustrate what joins can be used for. Let’s say we have two tables, one with Heroes and one with Superhero Teams. A Hero can belong to a team so the Heroes table will consist of 3 columns: 1) Their alias (e.g. “Wolverine”), 2) Their real name (e.g. “Logan”), and 3) the superhero team they’re most associated with (in Wolverine’s case, the “X-Men”). For good measure, we’ll throw in some heroes that usually roll solo and aren’t really associated with a particular team, like Daredevil. We will also have a team that does not have any heroes it (maybe all of them quit?). The Team table will have 3 columns as well: 1) The team name (e.g. “Avengers”), 2) their headquarters (e.g. “Avengers Mansion”), and 3) the location of said headquarters (e.g. “New York City). So here are our tables:
Inner Join also called Join:
An inner join looks at two tables and returns the data where there is a match in both tables. If you created an inner join between our two tables, you would get a table that looks something like this:
The SQL for this table would be:
SELECT hero.*, team.* FROM hero JOIN team ON hero.team_name = team.name;
The most important takeaway is that you would return a table with only heroes that have teams and only teams that have heroes. Sorry, Daredevil :(
Left Outer Join also called Left Join:
A left outer join preserves all of the left table (the Heroes) and records that have no relation in the right table (the Teams) will contain the value null. So using SQL that looks like this:
SELECT hero.*, team.* FROM hero LEFT OUTER JOIN team ON hero.team_name = team.name;
You’d get a table that looks like this:
As you can see we still have all 12 heroes , but the heroes that have no team (Daredevil, Elektra, and Ghost Rider) now have null values in the columns that correspond to the Team table.
Now say you only want to see the Heroes that run solo. All you have to do as add in a where clause.
Left Outer Join where B.key is Null
The SQL for joining the hero table with the team table but only getting back the heroes without a team this would look like this:
SELECT hero.*, team.* FROM hero LEFT OUTER JOIN team ON hero.team_name = team.name WHERE team.name = null;
which would create a table that looks like this:
As you can see, our data got a lot smaller and we only see the Heroes that are not on a team.
Right Outer Join also called Right Join:
This works the same way as Left Outer Join but preserves the right table (in our case, the Team table) and has null values in the left table (the Hero table) where no match exists.
SELECT hero.*, team.* FROM hero RIGHT OUTER JOIN team ON hero.team_name = team.name;
gives you a table that looks like this:
As you can see, all the information from all the teams is in tact, but the heroes that run solo are gone from our table and there are null values on the Hero side that correspond to the team Guardians of the Galaxy.
And, similar to left outer join, if you only wanted to see the the teams with no corresponding heroes, all you’d have to do is add in a where clause.
Right Outer Join where A.key is Null
The SQL for this would look like this:
SELECT hero.*, team.* FROM hero RIGHT OUTER JOIN team ON hero.team_name = team.name WHERE hero.team_name = null;
Now we only see teams with no heroes.
Now you’re probably thinking, what if I want to see all the heroes and all the teams? Well, that’s where Full Outer Join comes in.
Full Outer Join:
To return all the data from both the hero and the team tables you’d use a SQL command like this:
SELECT hero.*, team.* FROM hero FULL OUTER JOIN team ON hero.team_name = team.name;
and you’d get a table that looks like this:
Yay! We have all the heroes and all the teams!
But what about if I only wanted to see the heroes without teams and the teams without heroes? All you have to do is add in a where clause!
Full Outer Join where A.key is Null or B.key is Null
So to find all the heroes who do not have a teams and all the teams that don’t have hero you’d use this SQL:
SELECT hero.*, team.* FROM hero FULL OUTER JOIN team ON hero.team_name = team.name WHERE hero.team_name = null OR team.name = null;
and we’d get a table that looks like this:
Now we only see Heroes without a team and teams without a hero.
That’s all the options for joining tables. Hope this helps you understand the differences between all the different joins. Happy coding!