Join Variants
Learn about advanced join techniques, such as self-joins, anti-joins, and non-equi joins, to handle complex data relationships and comparisons in SQL.
We'll cover the following...
In the world of data analysis, we often need to compare records, link them together in creative ways, or identify things that do not have a match.
Imagine we are building a product alert system: we want to notify managers if any products have not been restocked for over a month and have no suppliers listed. A simple join will not do; we need more nuanced ways to connect and contrast data.
That is where Join variants come in. In this lesson, we’ll explore advanced join patterns such as self-joins, anti-joins, and non-equi joins.
These techniques expand our problem-solving toolkit when working with complex data relationships.
By the end of this lesson, we’ll be able to:
Identify situations where advanced joins are necessary.
Use self-joins to compare rows within the same table.
Use anti-joins to find non-matching rows between tables.
Apply non-equi joins to compare rows using inequality conditions.
Pattern overview
Category:
Comparison Patterns
Intent:
To connect rows in non-standard ways by comparing values within the same table, excluding matches, or using inequality conditions.
Motivation:
Standard INNER JOIN
and LEFT JOIN
operations are powerful, but they don’t always get the job done. When we need to find unmatched records, relate rows within the same table, or compare ranges, we turn to join variants.
Also known as:
Self join: Reflexive join
Anti join: Exclusion join
Non-equi join: Inequality join
Structure
1. Self join: Used to relate a table to itself. This is helpful when rows are connected by a parent-child, referral, or comparison relationship.
Given the following structure of the Customers
table:
Field | Type |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Where CustomerID
is the primary key and ReferralID
is self-referencing to CustomerID
. The table contains information about customers. Write an SQL query to show all referral relationships. Return the CustomerName
of the referrer and the CustomerName
of the referred customer.
SELECT c1.CustomerName AS Referrer,c2.CustomerName AS ReferredFROM Customers c1JOIN Customers c2 ON c1.CustomerID = c2.ReferralID;
2. Anti join: Used to find records in one table not matched by another. Typically implemented using LEFT JOIN ... WHERE other_table.column IS NULL
or NOT EXISTS
.
Given the following structure of the Products
table:
Field | Type |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Where ProductID
is the primary key and CategoryID
is the foreign key from the Categories
table. The table contains information about products.
Also, the following structure of the Product_Suppliers
table:
Field | Type |
|
|
|
|
Where ProductID
is the foreign key from Products
table and SupplierID
is the foreign key from the Suppliers
table. The table contains information about suppliers and their products.
Write an SQL query to find the names of all products that do not have any suppliers assigned.
Return the ProductName
of each product.
SELECT p.ProductNameFROM Products pLEFT JOIN Product_Suppliers ps ON p.ProductID = ps.ProductIDWHERE ps.SupplierID IS NULL;
3. Non-Equi Join: Used when the join condition involves inequality (<
, >
, <=
, >=
) instead of =
. Often applied in range-based joins.
Given the following structure of the Orders
table:
Field | Type |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Where OrderID
is the primary key and CustomerID
is the foreign key from the Customers
table referring to the customer who placed the order. The table contains information about the orders placed.
Also the following structure of the Suppliers
table:
Field | Type |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Where SupplierID
is the primary key. The table contains information about suppliers of products.
Write an SQL query to list the OrderID
, OrderDate
, and SupplierName
for each order-product-supplier combination where the average supplier lead time allows the product to be delivered on or before the expected delivery date.
SELECT o.OrderID, o.OrderDate, s.SupplierNameFROM Orders oJOIN Suppliers s ON o.ExpectedDeliveryDate - s.AvgLeadTime <= o.OrderDate;
Keywords
SELF JOIN
, LEFT JOIN
, NOT EXISTS
, IS NULL
, !=
, <
, >
, ON
, aliasing (table1.column
...