SQL Data Interview Series: Joins in SQL

Maria-Goretti Anike
8 min readDec 9, 2023

--

A SQL query goes into a bar, walks up to two tables and asks….

Image from Codingsight

Hey yo, data nerds. It’s Maria, your favourite Data Explorer 😄. Welcome to Day 7 of the SQL Data Analyst Interview Questions series. You can read up yesterday’s topic — String Functions in SQL — here. Today, we should be resting. Well, I should be resting.

That’s right. Normally, every last day in the series is used to rehash the discussed topics of the previous 6 days, and prepare for the next week’s series, in this case the Excel series. So, today was supposed to be for summary and reiteration and of course, rest. But then…

So, here we are discussing Joins as agreed. You all should thank Sai for this one. 😒😒 Anyway, let’s get into it.

What are Joins in SQL?

A join is a way to combine multiple tables into a single output. JOINs are used to combine rows from two or more tables, based on a related column between them. For example, how people from different races, genders, beliefs and career paths have discussions on football. These people might have little to no similarities or connections, but they can all come together and have a friendly discussion based on their mutual love and interest in one thing- football. That’s how joins are.

In SQL (Server), we have:

  • INNER JOIN • OUTER JOIN • CROSS JOIN

OUTER JOINs are further divided into:

  • LEFT [OUTER] JOIN • RIGHT [OUTER] JOIN • FULL [OUTER] JOIN

You know how you write a word so many times it starts to look incorrect to you? That’s how this ‘outer’ looks to me now. Outer Outer Outer. Ugh!

Before we proceed, take a look at these Venn diagrams illustrating different types of joins.

  1. We’ll start with the INNER JOIN. This returns only matching rows between two tables. Non-matching rows are eliminated. E.g., we have 2 tables: ‘Customers’ and ‘Products’, which have a similar column: Customers’ ProductID is Products’ ID.

Now, if we perform an inner join on these two tables…

…we’ll see that Customers 9 & 10 aren’t included. Why? Because in inner joins, non-matching rows are eliminated. The ProductIDs of 9 & 10 do not tally with that in the Products table, as such, there’s nothing to match them with. Additionally, Product 4 has been eliminated, too, as none of the customers have that in their ProductID, therefore there’s no row to match it with. Inner joins will only give matching rows between both tables involved. INNER JOIN = Only matching rows between the two tables.

2. LEFT [OUTER] JOIN: This returns all records from the left table (Customers) and the matching records from the right table (Products). Note: To be clear, your left table is the first table mentioned in the query (table A), while the right is the second (table B). So this can be interchanged.

Here, we see that all the records in our table A have been returned, but look at №s 9 & 10. Everything under the Products table for these numbers is NULL, given that they have no similarity with the Products table, which in turn does not recognize them. LEFT [OUTER] JOIN= Matching rows from the two tables + Non-matching rows from the left table.

3. RIGHT [OUTER] JOIN: This returns all records from the right table (Products) and the matching records from the left table (Customers).

We notice here that this is the first time we get to see Product 4. This is because in the Customers table, no customer has the ProductID 4, therefore Product 4 has no match with the Customers table. But since this is the right join, all records from table B have been returned. Also, Customers 9 & 10 are nowhere to be found, as both of them have no match with the Products table. RIGHT [OUTER] JOIN= Matching rows from the two tables + Non-matching rows from the right table.

4. FULL [OUTER] JOIN: This returns all rows from both the left and right tables, including the non-matching rows.

Here, we see all the rows from both tables together for the first (and possibly the last) time. Awwn 😊🤝🏾. As you can see, there’s an eleventh row ‘cos Product 4 wants and has nothing to do with these customers, i.e. 8 matching rows + 2 non-matching from Customers + 1 non-matching from Products. FULL [OUTER] JOIN= All rows.

5. CROSS JOIN: Also known as a cartesian JOIN, the CROSS JOIN retrieves all combinations of rows from each table. In this type of JOIN, the result set is returned by multiplying each row of table A with all rows in table B if no additional condition is introduced. Let me try to break this down with our two tables. In table A (Customers), we have 10 rows, right? Good. And in table B (Products), we have 4 rows. So, a cross join between these two is simply: Customers * Products, i.e. the 4 rows in Products will each be joined with the 10 rows in Customers. So, once Product 1 makes a round from Customer 1 to Customer 10, Product 2 will do the same, then Product 3, and finally Product 4.

As we can see here, we have 40 rows (10 customers * 4 products). Product 1- ‘Wine’ joins itself to every customer, then Product 2- ‘Sneakers’ does the same, and so on. Note: The ON clause isn’t used in CROSS JOINs. CROSS JOIN = Table A rows * Table B rows. Read up uses of this join here.

Notice in the aforementioned Venn diagram that we have inclusive and exclusive. So far, we’ve discussed only the inclusives. The exclusives are the same, only with the WHERE and IS NULL clauses. Let’s use our left join as an example. Normally, our left join brings only the matching rows, so now let’s get only the non-matching rows.

Now we’re done, time for the bonus round.

Bonus: SELF JOIN

A self-join is a regular JOIN, but the table is joined with itself. This implies that each row of the table is combined with itself and with every other row of the table. We have the inner self join, outer self join [left, right, full], and the cross self join.

  1. Let’s take it that we have a table- ‘Employees’.

Now, from this table, we’d like to get the names of the employees alongside their managers’ names.

In the INNER SELF JOIN, we only get the names of employees that have managers, as these are the matching rows.

2. Similarly to regular LEFT JOINs, all the records in table A are returned in LEFT SELF JOINs.

This shows that Kate has no manager, as she can’t be managed.

3. For the RIGHT SELF JOIN, matching rows + non-matching rows in table B are returned.

William, Sarah and Rita have no employee reporting to them, hence their ‘Employee’ column values are NULL.

4. All rows are returned for the FULL SELF JOIN.

5. All rows are multiplied by themselves in the CROSS SELF JOIN to give 25 rows.

Key Takeaways

  • INNER JOIN = Only matching rows between the two tables.
  • LEFT [OUTER] JOIN = Everything from the left + matching rows from the right.
  • RIGHT [OUTER] JOIN = Everything from the right + matching rows from the left
  • FULL [OUTER] JOIN = All rows gather round and sing Kumbaya.
  • CROSS JOIN = For some odd reason, everyone wants to be with everyone.
  • SELF JOINS = Same as regular JOINs, just a narcissistic table that wants only itself. Tsk.

Now let’s do this again, but this time with football team fans. We have the Blue Team (no name was mentioned) and the Red Team (again, no names).

  • INNER JOIN = Only close friends from the Blue & Red fan teams.
  • LEFT [OUTER] JOIN = All the fans of the Blue Team + only their friends on the Red Team.
  • RIGHT [OUTER] JOIN = All the fans of the Red Team + only their friends on the Blue Team.
  • FULL [OUTER] JOIN = Everybody together yay whatever.
  • CROSS JOIN = Every fan on the Red Team pairing with all the fans on the Blue Team singly and accordingly.
  • SELF JOINS = Only the fans of the Blue Team plus themselves again. Ugh! Tsk tsk.

Whew! That’s a wrap. I had so much fun with this, and I learnt so much on self joins while studying to write this article. Truly, the best way to learn, is to teach. Quite a paradox. So, yeah, thank you, Sai 😁. I have a better grasp on joins now, and I hope you do, too. As always, I know you learnt a thing or four from this, and it was also thoroughly enjoyable and fun to read, so give your good ol’ favourite Data Explorer lots of claps (not one; really though, clap for yourself just one time and see whether you like it) and encouraging comments. If you have any question on this or any SQL Series topic covered so far, you can ask them here and today, ‘cos from tomorrow, we’ll begin our Excel series. Hope to see you later today to officially round off our SQL Series 🤗.

Hehe 😅

--

--

Maria-Goretti Anike

Hey yo there 😄! I'm Maria, your favourite Data Explorer and ardent SQL devotee. I write all about Healthcare, Marketing, and Product Analytics.