Learning Guides
Menu

Joins Revisited

8 min readLearning SQL

Joins Revisited

You've learned the basics of joins. Now let's explore advanced techniques: complex join conditions, mixing join types, self-joins for hierarchical data, and solving real-world problems with joins.

Review: Join Types

Join TypeReturns
INNER JOINOnly matching rows from both tables
LEFT JOINAll from left + matching from right
RIGHT JOINAll from right + matching from left
FULL JOINAll from both, NULLs where no match
CROSS JOINAll combinations (Cartesian product)

Complex Join Conditions

Multiple Conditions

Joins can have multiple conditions connected with AND:

SQL
-- Find orders where the order date matches the customer's join date
SELECT
    c.first_name,
    c.last_name,
    o.order_id,
    o.order_date
FROM customers c
JOIN orders o
    ON c.customer_id = o.customer_id
    AND DATE(o.order_date) = c.join_date;

Non-Equality Joins

Joins don't require equality:

SQL
-- Match books to price ranges
CREATE TABLE price_ranges (
    range_name VARCHAR(20),
    min_price DECIMAL(8,2),
    max_price DECIMAL(8,2)
);
 
INSERT INTO price_ranges VALUES
    ('Budget', 0, 13.99),
    ('Standard', 14.00, 17.99),
    ('Premium', 18.00, 99.99);
 
SELECT
    b.title,
    b.price,
    pr.range_name
FROM books b
JOIN price_ranges pr
    ON b.price >= pr.min_price
    AND b.price <= pr.max_price;
titlepricerange_name
Animal Farm12.99Budget
Pride and Prejudice14.99Standard
198415.99Standard
Norwegian Wood16.99Standard
Kafka on the Shore17.99Standard
One Hundred Years...18.99Premium

BETWEEN in Joins

SQL
-- Same result using BETWEEN
SELECT b.title, b.price, pr.range_name
FROM books b
JOIN price_ranges pr ON b.price BETWEEN pr.min_price AND pr.max_price;

Self-Joins

A table joining to itself. Essential for hierarchical or comparative queries.

Comparing Rows Within Same Table

SQL
-- Find book pairs by the same author
SELECT
    b1.title AS book1,
    b2.title AS book2,
    b1.author_id
FROM books b1
JOIN books b2
    ON b1.author_id = b2.author_id
    AND b1.book_id < b2.book_id;  -- Avoid duplicates and self-matches
book1book2author_id
1984Animal Farm1
Norwegian WoodKafka on the Shore4

Note

The condition b1.book_id < b2.book_id ensures each pair appears only once and a book isn't paired with itself.

Finding Price Differences

SQL
-- Books that cost more than another book by the same author
SELECT
    b1.title AS expensive_book,
    b1.price AS higher_price,
    b2.title AS cheaper_book,
    b2.price AS lower_price,
    b1.price - b2.price AS price_diff
FROM books b1
JOIN books b2
    ON b1.author_id = b2.author_id
    AND b1.price > b2.price;

Hierarchical Data

Self-joins are common for organizational structures:

SQL
-- Employees and their managers (if we had an employees table)
SELECT
    e.name AS employee,
    m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

Mixing Join Types

You can combine different join types in one query:

SQL
-- All customers, their orders (if any), and order items (if any)
SELECT
    c.first_name,
    c.last_name,
    o.order_id,
    b.title,
    oi.quantity
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id
LEFT JOIN books b ON oi.book_id = b.book_id
ORDER BY c.last_name, c.first_name, o.order_id;

Carol (with no orders) appears with NULLs for order, item, and book columns.

Inner Join After Left Join

Be careful—an INNER JOIN after a LEFT JOIN can eliminate the rows you wanted to keep:

SQL
-- This LOSES customers without orders!
SELECT c.first_name, o.order_id, oi.quantity
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
INNER JOIN order_items oi ON o.order_id = oi.order_id;  -- INNER removes NULLs
 
-- Keep all customers by using LEFT JOIN consistently
SELECT c.first_name, o.order_id, oi.quantity
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id;

Joining Three or More Tables

Real queries often need multiple tables:

SQL
-- Complete order report
SELECT
    c.first_name || ' ' || c.last_name AS customer,
    o.order_id,
    o.order_date,
    b.title,
    a.first_name || ' ' || a.last_name AS author,
    oi.quantity,
    oi.unit_price,
    oi.quantity * oi.unit_price AS line_total
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN books b ON oi.book_id = b.book_id
JOIN authors a ON b.author_id = a.author_id
ORDER BY o.order_date, o.order_id;

CROSS JOIN Use Cases

Though often accidental, CROSS JOINs have legitimate uses:

Generating All Combinations

SQL
-- All possible author-category combinations for a matrix
SELECT
    a.last_name AS author,
    c.category_name
FROM authors a
CROSS JOIN categories c;

With Date Ranges

SQL
-- Generate report structure for all months and products
WITH months AS (
    SELECT generate_series(
        '2024-01-01'::DATE,
        '2024-12-01'::DATE,
        '1 month'::INTERVAL
    ) AS month
)
SELECT
    m.month,
    p.product_name,
    COALESCE(SUM(s.quantity), 0) AS units_sold
FROM months m
CROSS JOIN products p
LEFT JOIN sales s
    ON s.product_id = p.product_id
    AND DATE_TRUNC('month', s.sale_date) = m.month
GROUP BY m.month, p.product_name;

Using USING and NATURAL JOIN

USING Clause

When join columns have identical names:

SQL
-- These are equivalent
SELECT * FROM books JOIN authors ON books.author_id = authors.author_id;
SELECT * FROM books JOIN authors USING (author_id);

With USING, the column appears once in the result, not twice.

NATURAL JOIN

Automatically joins on all matching column names:

SQL
SELECT * FROM orders NATURAL JOIN customers;
-- Joins on customer_id (the common column)

Warning

Avoid NATURAL JOIN in production code. Adding a column with a matching name later will silently change your query's behavior.

Lateral Joins

LATERAL joins allow the subquery to reference columns from preceding tables:

SQL
-- Top 2 most expensive books per author
SELECT
    a.first_name,
    a.last_name,
    top_books.title,
    top_books.price
FROM authors a
CROSS JOIN LATERAL (
    SELECT title, price
    FROM books b
    WHERE b.author_id = a.author_id
    ORDER BY price DESC
    LIMIT 2
) AS top_books;

This is like a correlated subquery but in the FROM clause.

Join Algorithms (How Databases Execute Joins)

Understanding join algorithms helps with performance tuning:

Nested Loop Join

  • For each row in table A, scan table B for matches
  • Fast for small tables or when B has an index
  • Slow for large unindexed tables

Hash Join

  • Build hash table from smaller table
  • Probe with larger table
  • Good for large tables, equality joins

Merge Join

  • Sort both tables on join key
  • Merge sorted results
  • Efficient when data is already sorted

Note

You don't choose the algorithm directly—the query optimizer does. But you can influence it through indexes and query structure.

Performance Tips

1. Index Foreign Keys

SQL
CREATE INDEX idx_books_author_id ON books(author_id);
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);

2. Filter Early

SQL
-- Better: filter in the join or WHERE
SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2024-01-01';
 
-- Worse: filtering in a derived table after joining everything

3. Select Only Needed Columns

SQL
-- Good
SELECT c.first_name, c.last_name, COUNT(o.order_id)
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name;
 
-- Avoid
SELECT * FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

4. Use EXPLAIN

SQL
EXPLAIN ANALYZE
SELECT * FROM books b
JOIN authors a ON b.author_id = a.author_id
WHERE b.price > 15;

Practical Examples

Example 1: Finding Orphan Records

SQL
-- Order items referencing deleted books
SELECT oi.*
FROM order_items oi
LEFT JOIN books b ON oi.book_id = b.book_id
WHERE b.book_id IS NULL;
 
-- Customers with no orders
SELECT c.customer_id, c.first_name, c.last_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

Example 2: Comparison Report

SQL
-- Compare this month's sales to last month by product
WITH current_month AS (
    SELECT book_id, SUM(quantity) AS qty
    FROM order_items oi
    JOIN orders o ON oi.order_id = o.order_id
    WHERE o.order_date >= DATE_TRUNC('month', CURRENT_DATE)
    GROUP BY book_id
),
last_month AS (
    SELECT book_id, SUM(quantity) AS qty
    FROM order_items oi
    JOIN orders o ON oi.order_id = o.order_id
    WHERE o.order_date >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
      AND o.order_date < DATE_TRUNC('month', CURRENT_DATE)
    GROUP BY book_id
)
SELECT
    b.title,
    COALESCE(cm.qty, 0) AS this_month,
    COALESCE(lm.qty, 0) AS last_month,
    COALESCE(cm.qty, 0) - COALESCE(lm.qty, 0) AS change
FROM books b
LEFT JOIN current_month cm ON b.book_id = cm.book_id
LEFT JOIN last_month lm ON b.book_id = lm.book_id
ORDER BY change DESC;

Example 3: Full Customer Order History

SQL
SELECT
    c.customer_id,
    c.first_name,
    c.last_name,
    COUNT(DISTINCT o.order_id) AS order_count,
    COUNT(oi.order_item_id) AS items_purchased,
    COALESCE(SUM(oi.quantity), 0) AS total_units,
    COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS total_spent,
    STRING_AGG(DISTINCT b.title, ', ' ORDER BY b.title) AS books_purchased
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
LEFT JOIN order_items oi ON o.order_id = oi.order_id
LEFT JOIN books b ON oi.book_id = b.book_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_spent DESC;

What's Next

The next chapter covers conditional logic with CASE expressions—adding if/then/else logic directly into your SQL queries.