Learning Guides
Menu

Working with Sets

8 min readLearning SQL

Working with Sets

SQL is built on set theory. Beyond joins, SQL provides set operators that combine the results of multiple queries. These operators—UNION, INTERSECT, and EXCEPT—treat query results as mathematical sets.

Set Operations vs. Joins

Joins combine columns from different tables horizontally:

PLAINTEXT
Table A      JOIN      Table B
[col1, col2]   +    [col3, col4]   =   [col1, col2, col3, col4]

Set operations combine rows vertically:

PLAINTEXT
Query A results
   +
Query B results
   =
Combined rows

For set operations to work, both queries must have the same number of columns with compatible data types.

UNION: Combining Results

UNION combines results from two queries and removes duplicates:

SQL
-- All names from both customers and authors
SELECT first_name, last_name FROM customers
UNION
SELECT first_name, last_name FROM authors;
first_namelast_name
AliceChen
BobMartinez
CarolTaylor
GeorgeOrwell
JaneAusten
GabrielGarcía Márquez
HarukiMurakami

UNION ALL: Keep Duplicates

UNION ALL keeps all rows, including duplicates:

SQL
SELECT first_name, last_name FROM customers
UNION ALL
SELECT first_name, last_name FROM authors;

Note

UNION ALL is faster than UNION because it doesn't need to check for and remove duplicates. Use UNION ALL when you know there are no duplicates or when duplicates are acceptable.

Practical UNION Example

SQL
-- Create a unified contact list from multiple sources
SELECT
    'Customer' AS contact_type,
    first_name,
    last_name,
    email
FROM customers
 
UNION
 
SELECT
    'Author' AS contact_type,
    first_name,
    last_name,
    NULL AS email  -- Authors don't have email in our schema
FROM authors
 
ORDER BY last_name, first_name;
contact_typefirst_namelast_nameemail
JaneAustenAuthorNULL
AliceChenCustomeralice.chen@email.com
GabrielGarcía MárquezAuthorNULL
............

INTERSECT: Finding Common Rows

INTERSECT returns only rows that appear in both query results:

SQL
-- Find first names that appear in both customers and authors
SELECT first_name FROM customers
INTERSECT
SELECT first_name FROM authors;

If a customer named "George" existed, "George" would appear in the result.

Practical INTERSECT Example

SQL
-- Books that have been ordered AND are currently in stock
SELECT book_id, title FROM books WHERE stock_quantity > 0
INTERSECT
SELECT DISTINCT b.book_id, b.title
FROM books b
JOIN order_items oi ON b.book_id = oi.book_id;

Warning

MySQL doesn't support INTERSECT directly. Use a JOIN or EXISTS subquery instead:

SQL
-- MySQL alternative to INTERSECT
SELECT DISTINCT b.book_id, b.title
FROM books b
JOIN order_items oi ON b.book_id = oi.book_id
WHERE b.stock_quantity > 0;

EXCEPT: Finding Differences

EXCEPT (called MINUS in Oracle) returns rows from the first query that don't appear in the second:

SQL
-- Find books that have never been ordered
SELECT book_id, title FROM books
EXCEPT
SELECT DISTINCT b.book_id, b.title
FROM books b
JOIN order_items oi ON b.book_id = oi.book_id;

Order Matters with EXCEPT

EXCEPT is not symmetric:

SQL
-- Query A EXCEPT Query B: rows in A but not in B
-- Query B EXCEPT Query A: rows in B but not in A (different result!)
SQL
-- Books not ordered (A - B)
SELECT book_id FROM books
EXCEPT
SELECT book_id FROM order_items;
 
-- Order items for deleted books (B - A) - different question!
SELECT book_id FROM order_items
EXCEPT
SELECT book_id FROM books;

Warning

MySQL doesn't support EXCEPT. Use NOT EXISTS or NOT IN instead:

SQL
-- MySQL alternative to EXCEPT
SELECT book_id, title FROM books b
WHERE NOT EXISTS (
    SELECT 1 FROM order_items oi WHERE oi.book_id = b.book_id
);

Rules for Set Operations

1. Same Number of Columns

Both queries must select the same number of columns:

SQL
-- Error: different column counts
SELECT first_name, last_name, email FROM customers
UNION
SELECT first_name, last_name FROM authors;  -- Missing column!
 
-- Fix: add NULL or a placeholder
SELECT first_name, last_name, email FROM customers
UNION
SELECT first_name, last_name, NULL AS email FROM authors;

2. Compatible Data Types

Corresponding columns must have compatible types:

SQL
-- Error: can't union INT with VARCHAR
SELECT customer_id FROM customers
UNION
SELECT email FROM customers;
 
-- Works: both are strings
SELECT first_name FROM customers
UNION
SELECT last_name FROM authors;

3. Column Names Come from First Query

The result uses column names from the first query:

SQL
SELECT first_name AS name FROM customers
UNION
SELECT last_name AS family_name FROM authors;
-- Result column is named "name", not "family_name"

Ordering Set Operation Results

ORDER BY applies to the entire combined result and must come last:

SQL
SELECT first_name, last_name, 'Customer' AS type FROM customers
UNION
SELECT first_name, last_name, 'Author' AS type FROM authors
ORDER BY last_name, first_name;  -- Sorts the combined result

Warning

You cannot ORDER BY individual queries in a UNION. Only one ORDER BY is allowed, at the very end.

Combining Multiple Set Operations

You can chain set operations:

SQL
SELECT column1 FROM table1
UNION
SELECT column1 FROM table2
UNION
SELECT column1 FROM table3
EXCEPT
SELECT column1 FROM table4;

Precedence

In standard SQL, INTERSECT has higher precedence than UNION and EXCEPT. Use parentheses for clarity:

SQL
-- Without parentheses, INTERSECT binds first
SELECT * FROM A
UNION
SELECT * FROM B
INTERSECT
SELECT * FROM C;
-- Equivalent to: A UNION (B INTERSECT C)
 
-- Use parentheses for different grouping
(SELECT * FROM A
 UNION
 SELECT * FROM B)
INTERSECT
SELECT * FROM C;

Practical Examples

Example 1: Unified Activity Feed

SQL
-- Combine different types of activities into one feed
SELECT
    order_date AS activity_date,
    'Order Placed' AS activity_type,
    CONCAT('Order #', order_id, ' - $', total_amount) AS description
FROM orders
 
UNION ALL
 
SELECT
    join_date AS activity_date,
    'New Customer' AS activity_type,
    CONCAT(first_name, ' ', last_name, ' joined') AS description
FROM customers
 
ORDER BY activity_date DESC
LIMIT 10;

Example 2: Finding Gaps

SQL
-- Find authors with no books in our catalog
SELECT author_id, first_name, last_name
FROM authors
WHERE author_id NOT IN (
    SELECT DISTINCT author_id FROM books
);
 
-- Using EXCEPT (where supported)
SELECT author_id FROM authors
EXCEPT
SELECT author_id FROM books;

Example 3: Comparing Two Time Periods

SQL
-- Customers who ordered in January but not February
SELECT DISTINCT customer_id
FROM orders
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01'
 
EXCEPT
 
SELECT DISTINCT customer_id
FROM orders
WHERE order_date >= '2024-02-01' AND order_date < '2024-03-01';

Example 4: Finding Common Customers

SQL
-- Customers who bought both Orwell books
SELECT customer_id
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE oi.book_id = 1  -- 1984
 
INTERSECT
 
SELECT customer_id
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE oi.book_id = 2;  -- Animal Farm

Set Operations vs. Alternatives

Often you can achieve the same result multiple ways:

UNION vs. OR

SQL
-- Using UNION
SELECT * FROM books WHERE author_id = 1
UNION
SELECT * FROM books WHERE author_id = 4;
 
-- Using OR (simpler for same table)
SELECT * FROM books WHERE author_id = 1 OR author_id = 4;
 
-- Using IN (even simpler)
SELECT * FROM books WHERE author_id IN (1, 4);

INTERSECT vs. JOIN

SQL
-- Using INTERSECT
SELECT customer_id FROM orders WHERE status = 'completed'
INTERSECT
SELECT customer_id FROM orders WHERE total_amount > 30;
 
-- Using AND (simpler for same table)
SELECT DISTINCT customer_id FROM orders
WHERE status = 'completed' AND total_amount > 30;

EXCEPT vs. NOT IN / NOT EXISTS

SQL
-- Using EXCEPT
SELECT book_id FROM books
EXCEPT
SELECT book_id FROM order_items;
 
-- Using NOT IN
SELECT book_id FROM books
WHERE book_id NOT IN (SELECT book_id FROM order_items);
 
-- Using NOT EXISTS
SELECT book_id FROM books b
WHERE NOT EXISTS (
    SELECT 1 FROM order_items oi WHERE oi.book_id = b.book_id
);

When to Use Set Operations

Use set operations when:

  • Combining results from different tables with similar structures
  • The logic is clearer with set terminology
  • You need to remove duplicates (UNION) or find common elements (INTERSECT)

Use JOINs/subqueries when:

  • Working with a single table
  • You need columns from multiple tables in the result
  • Your database doesn't support certain set operations

What's Next

The next chapter covers data generation and conversion—string manipulation, date functions, and converting between data types.