Learning Guides
Menu

Subqueries

8 min readLearning SQL

Subqueries

A subquery is a query nested inside another query. Subqueries let you use the result of one query as input to another, enabling complex filtering and data transformations that would otherwise require multiple steps.

Subquery Basics

A subquery is enclosed in parentheses and can appear in several places:

SQL
-- In WHERE clause
SELECT * FROM books
WHERE author_id IN (SELECT author_id FROM authors WHERE nationality = 'British');
 
-- In FROM clause (derived table)
SELECT * FROM (SELECT * FROM books WHERE price > 15) AS expensive_books;
 
-- In SELECT clause (scalar subquery)
SELECT title, (SELECT MAX(price) FROM books) AS max_price FROM books;

Subqueries in WHERE Clauses

Single-Value Subqueries

When a subquery returns exactly one value, use it with comparison operators:

SQL
-- Find books priced above average
SELECT title, price
FROM books
WHERE price > (SELECT AVG(price) FROM books);
titleprice
One Hundred Years of Solitude18.99
Norwegian Wood16.99
Kafka on the Shore17.99
SQL
-- Find the most recent order
SELECT * FROM orders
WHERE order_date = (SELECT MAX(order_date) FROM orders);

Warning

If a subquery used with =, >, < etc. returns more than one row, you'll get an error. Use IN, ANY, or ALL for multi-row results.

Multi-Value Subqueries with IN

When a subquery returns multiple values, use IN:

SQL
-- Books by British authors
SELECT title, price
FROM books
WHERE author_id IN (
    SELECT author_id
    FROM authors
    WHERE nationality = 'British'
);
titleprice
198415.99
Animal Farm12.99
Pride and Prejudice14.99
SQL
-- Customers who have placed orders
SELECT first_name, last_name
FROM customers
WHERE customer_id IN (
    SELECT DISTINCT customer_id FROM orders
);

NOT IN for Exclusion

SQL
-- Books that have never been ordered
SELECT title
FROM books
WHERE book_id NOT IN (
    SELECT DISTINCT book_id FROM order_items
);

NOT IN and NULL

If the subquery returns any NULL values, NOT IN returns no results! This is because x NOT IN (1, 2, NULL) evaluates to UNKNOWN.

SQL
-- Safer: use NOT EXISTS instead
SELECT title FROM books b
WHERE NOT EXISTS (
    SELECT 1 FROM order_items oi WHERE oi.book_id = b.book_id
);

ANY and ALL Operators

ANY (or SOME) returns true if the comparison is true for at least one value:

SQL
-- Books cheaper than ANY Murakami book
SELECT title, price
FROM books
WHERE price < ANY (
    SELECT price FROM books WHERE author_id = 4
);

ALL returns true if the comparison is true for every value:

SQL
-- Books cheaper than ALL Murakami books
SELECT title, price
FROM books
WHERE price < ALL (
    SELECT price FROM books WHERE author_id = 4
);

Correlated Subqueries

A correlated subquery references the outer query. It executes once for each row in the outer query.

SQL
-- Books priced above their author's average
SELECT b.title, b.price, b.author_id
FROM books b
WHERE b.price > (
    SELECT AVG(b2.price)
    FROM books b2
    WHERE b2.author_id = b.author_id  -- References outer query
);

The subquery WHERE b2.author_id = b.author_id correlates with the outer query—it calculates a different average for each author.

EXISTS and NOT EXISTS

EXISTS tests whether a subquery returns any rows:

SQL
-- Customers who have placed at least one order
SELECT c.first_name, c.last_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

NOT EXISTS finds rows with no matching subquery results:

SQL
-- Authors with no books in our catalog
SELECT a.first_name, a.last_name
FROM authors a
WHERE NOT EXISTS (
    SELECT 1
    FROM books b
    WHERE b.author_id = a.author_id
);

Note

EXISTS only checks for existence—it doesn't matter what columns the subquery selects. SELECT 1, SELECT *, or SELECT column all work the same.

Derived Tables (Subqueries in FROM)

A subquery in the FROM clause creates a temporary table for the outer query:

SQL
-- Average of per-author book counts
SELECT AVG(book_count) AS avg_books_per_author
FROM (
    SELECT author_id, COUNT(*) AS book_count
    FROM books
    GROUP BY author_id
) AS author_counts;

Warning

Derived tables MUST have an alias in most databases. The AS author_counts is required.

Practical Uses of Derived Tables

SQL
-- Top customers with their ranking
SELECT
    customer_name,
    total_spent,
    RANK() OVER (ORDER BY total_spent DESC) AS spending_rank
FROM (
    SELECT
        CONCAT(c.first_name, ' ', c.last_name) AS customer_name,
        COALESCE(SUM(o.total_amount), 0) AS total_spent
    FROM customers c
    LEFT JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.first_name, c.last_name
) AS customer_totals;

Scalar Subqueries in SELECT

A scalar subquery returns exactly one value and can appear in the SELECT clause:

SQL
SELECT
    title,
    price,
    (SELECT AVG(price) FROM books) AS avg_price,
    price - (SELECT AVG(price) FROM books) AS diff_from_avg
FROM books;
titlepriceavg_pricediff_from_avg
198415.9916.32-0.33
Animal Farm12.9916.32-3.33
............

Correlated Scalar Subqueries

SQL
-- Each book with its author's total book count
SELECT
    b.title,
    b.author_id,
    (SELECT COUNT(*) FROM books b2 WHERE b2.author_id = b.author_id) AS author_book_count
FROM books b;
titleauthor_idauthor_book_count
198412
Animal Farm12
Pride and Prejudice21
Norwegian Wood42
Kafka on the Shore42

Common Table Expressions (CTEs)

CTEs provide a cleaner alternative to derived tables:

SQL
WITH author_stats AS (
    SELECT
        author_id,
        COUNT(*) AS book_count,
        AVG(price) AS avg_price
    FROM books
    GROUP BY author_id
)
SELECT
    a.first_name,
    a.last_name,
    s.book_count,
    s.avg_price
FROM authors a
JOIN author_stats s ON a.author_id = s.author_id;

Multiple CTEs

SQL
WITH
    order_totals AS (
        SELECT customer_id, SUM(total_amount) AS total_spent
        FROM orders
        GROUP BY customer_id
    ),
    customer_ranks AS (
        SELECT
            customer_id,
            total_spent,
            RANK() OVER (ORDER BY total_spent DESC) AS rank
        FROM order_totals
    )
SELECT
    c.first_name,
    c.last_name,
    cr.total_spent,
    cr.rank
FROM customers c
JOIN customer_ranks cr ON c.customer_id = cr.customer_id;

Recursive CTEs

For hierarchical data like organizational charts or category trees:

SQL
-- Example: Employee hierarchy
WITH RECURSIVE employee_tree AS (
    -- Base case: top-level managers
    SELECT employee_id, name, manager_id, 0 AS level
    FROM employees
    WHERE manager_id IS NULL
 
    UNION ALL
 
    -- Recursive case: employees under managers
    SELECT e.employee_id, e.name, e.manager_id, et.level + 1
    FROM employees e
    JOIN employee_tree et ON e.manager_id = et.employee_id
)
SELECT * FROM employee_tree ORDER BY level, name;

Subquery vs. Join

Many subqueries can be rewritten as joins:

SQL
-- Subquery version
SELECT title FROM books
WHERE author_id IN (
    SELECT author_id FROM authors WHERE nationality = 'British'
);
 
-- Join version
SELECT b.title
FROM books b
JOIN authors a ON b.author_id = a.author_id
WHERE a.nationality = 'British';

When to Use Which?

Use Subquery When...Use Join When...
You need values from one table onlyYou need columns from multiple tables
Logic is clearer with nested structurePerformance is critical (often faster)
Using EXISTS/NOT EXISTSJoining on multiple conditions
Computing aggregates for filteringMany-to-many relationships

Practical Examples

Example 1: Above-Average Analysis

SQL
-- Books priced above their category's average
WITH category_avg AS (
    SELECT
        author_id,
        AVG(price) AS avg_price
    FROM books
    GROUP BY author_id
)
SELECT
    b.title,
    b.price,
    a.avg_price AS author_avg,
    b.price - a.avg_price AS above_avg
FROM books b
JOIN category_avg a ON b.author_id = a.author_id
WHERE b.price > a.avg_price;

Example 2: Customers Without Recent Orders

SQL
-- Customers who haven't ordered in the last 30 days
SELECT first_name, last_name, email
FROM customers c
WHERE NOT EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.customer_id
      AND o.order_date > CURRENT_DATE - INTERVAL 30 DAY
);

Example 3: Running Totals with Subquery

SQL
-- Cumulative order totals per customer
SELECT
    o.order_id,
    o.customer_id,
    o.order_date,
    o.total_amount,
    (
        SELECT SUM(o2.total_amount)
        FROM orders o2
        WHERE o2.customer_id = o.customer_id
          AND o2.order_date <= o.order_date
    ) AS running_total
FROM orders o
ORDER BY o.customer_id, o.order_date;

Example 4: Percentile Ranking

SQL
-- Find books in the top 25% by price
SELECT title, price
FROM books b
WHERE price >= (
    SELECT price FROM books
    ORDER BY price DESC
    LIMIT 1 OFFSET (SELECT CEIL(COUNT(*) * 0.25) - 1 FROM books)
)
ORDER BY price DESC;

What's Next

The next chapter revisits joins with more advanced techniques—self-joins, multiple join types in one query, and complex join conditions.