SQL — Mistakes using LEFT JOIN, Sub-query and SELECT

Junlong Wang
4 min readJul 12, 2018

--

A recent ticket did give me a good chance to look back at JOIN, Sub-query and SELECT, and most importantly, to think what happens when using them.

“A rural dirt road with a fence and tall trees on its sides” by Lukas Neasi on Unsplash

The requirement of that ticket is to add a new column in an existing SQL View that joins a number of tables. This new column needs to display the sum of the value from PurchaseOrderCosts table with particular condition related to other two tables.

The relationships between the three tables are described as below:

PurchaseOrders — || — — — — — — — — -0← PurchaseOrderCosts

Suppliers — || — — — — — — — — -0< — PurchaseOrdersCosts

PurchaseOrders →0 — — — — — — — — -|| — Suppliers

What needs to be summed is column Cost of PurchaseOrderCosts from the rows that have the same Supplier as PurchaseOrders. The original SQL script of the View has been simplified as below

SELECT po.Id,
po.OrderNumber,
po.OrderDate,
s.Id AS SupplierId,
s.SupplierName AS Supplier,
po.OrderStatus,
c.CurrencyCode AS Currency,
w.WarehouseName,
w.Id AS [WarehouseId],
ISNULL(po.Total, 0) As CurrencySubTotal,
so.OrderNumber AS SalesOrderNumber,
po.Comments AS Comments,
po.SalesOrderId AS SalesOrderId
FROM PurchaseOrders po
LEFT JOIN Suppliers s ON s.Id = po.SupplierId
LEFT JOIN Currencies c ON c.Id = po.CurrencyId
LEFT JOIN Warehouse w ON po.WarehouseId = w.Id
LEFT JOIN SalesOrders so ON po.SalesOrderId = so.Id

Mistakes

1. Use SUM clause in SELECT statement.

This could be a solution but the problem is I have to add a long list of GROUP BY value.

SELECT po.Id,
po.OrderNumber,
po.OrderDate,
s.Id AS SupplierId,
s.SupplierName AS Supplier,
po.OrderStatus,
c.CurrencyCode AS Currency,
w.WarehouseName,
w.Id AS [WarehouseId],
SUM(ISNULL(poc.Cost, 0)) as SupplierCosts,
ISNULL(po.Total, 0) As CurrencySubTotal,
so.OrderNumber AS SalesOrderNumber,
po.Comments AS Comments,
po.SalesOrderId AS SalesOrderId
FROM PurchaseOrders po
LEFT JOIN Suppliers s ON s.Id = po.SupplierId
LEFT JOIN Currencies c ON c.Id = po.CurrencyId
LEFT JOIN Warehouse w ON po.WarehouseId = w.Id
LEFT JOIN SalesOrders so ON po.SalesOrderId = so.Id
LEFT JOIN PurchaseOrderCosts poc on poc.PurchaseOrderId = po.Id and po.SupplierId = poc.SupplierId
GROUP BY po.Id, po.OrderNumber, s.SupplierName, ....2. Misunderstood Sub Query

2.1 Use LEFT JOIN If I have to sum the Costs that have the same Supplier as the PurchaseOrder, I shouldn’t have used LEFT JOIN because it selects the record matching PurchaseOrderId but not matching SupplierId to be summed. The correct JOIN is INNER JOIN.

SELECT po.Id,
po.OrderNumber,
po.OrderDate,
s.Id AS SupplierId,
s.SupplierName AS Supplier,
po.OrderStatus,
c.CurrencyCode AS Currency,
w.WarehouseName,
w.Id AS [WarehouseId],
(
SELECT SUM(ISNULL(poc.Cost,0))
FROM PurchaseOrderCosts poc
LEFT JOIN PurchaseOrders po on po.Id = poc.PurchaseOrderId and poc.SupplierId = po.SupplierId
) AS SupplierCosts,
ISNULL(po.Total, 0) As CurrencySubTotal,
so.OrderNumber AS SalesOrderNumber,
po.Comments AS Comments,
po.SalesOrderId AS SalesOrderId
FROM PurchaseOrders po
LEFT JOIN Suppliers s ON s.Id = po.SupplierId
LEFT JOIN Currencies c ON c.Id = po.CurrencyId
LEFT JOIN Warehouse w ON po.WarehouseId = w.Id
LEFT JOIN SalesOrders so ON po.SalesOrderId = so.IdThe execution of subquery looks like:
Image 1

and the execution of view looks like:

Image 2

2.2 WHERE condition

In situation of mistake 2.1, I thought I should use WHERE to filter the Costs and I wrote:

...
(
SELECT SUM(ISNULL(poc.Cost,0))
FROM PurchaseOrderCosts poc
LEFT JOIN PurchaseOrders po on po.Id = poc.PurchaseOrderId and poc.SupplierId = po.SupplierId
WHERE po.Id = poc.PurchaseOrderId
) AS SupplierCosts,
...

The problem is the LEFT JOIN has included the value that matches either PurchaseOrderId or SupplierId. When po.Id is 7, for example, the sub query result will be 200 + 30, but it should be 30 instead.

How do you solve this problem? From Image 1 we can see that the exactly matched records will have the same Id and PurchaseOrderId, so use WHERE po.Id = po2.Id. Otherwise, change LEFT JOIN to INNER JOIN.

3. LEFT JOIN with Sub Query

Use Sub Query with LEFT JOIN is to create another “Table” and join it with current tables. See the script:

SELECT po.Id,
po.OrderNumber,
po.OrderDate,
s.Id AS SupplierId,
s.SupplierName AS Supplier,
po.OrderStatus,
c.CurrencyCode AS Currency,
w.WarehouseName,
w.Id AS [WarehouseId],
ISNULL(costs.SupplierCosts, 0) AS SupplierCosts,
ISNULL(po.Total, 0) As CurrencySubTotal,
so.OrderNumber AS SalesOrderNumber,
po.Comments AS Comments,
po.SalesOrderId AS SalesOrderId
FROM PurchaseOrders po
LEFT JOIN Suppliers s ON s.Id = po.SupplierId
LEFT JOIN Currencies c ON c.Id = po.CurrencyId
LEFT JOIN Warehouse w ON po.WarehouseId = w.Id
LEFT JOIN SalesOrders so ON po.SalesOrderId = so.Id
LEFT JOIN(
SELECT SupplierId, PurchaseOrderId, SUM(ISNULL(Cost, 0)) AS SupplierCosts
FROM PurchaseOrderCosts
GROUP BY SupplierId, PurchaseOrderId
) costs on po.SupplierId = costs.SupplierId and po.Id = costs.PurchaseOrderId

The execution of sub query is:

Image 3

4. Summary

LEFT JOIN is based on the table on the left side and join the table on the right based on particular condition. Make sure to understand that the result might be different if swap the left table with the right.

LEFT JOIN will produce the same amount of rows as the left table. For some columns of each row, the value can be Null if the condition is not satisified.

Bear this in mind before filtering rows using WHERE clause, because the inproper WHERE condition will skip the unwanted rows.

Use sub query with LEFT JOIN instead of SELECT statement.

When the same table appears in and outside of the sub query, use a different alias for it in order to distinguish them.

--

--