Indexes and Constraints
Indexes and Constraints
Indexes speed up data retrieval. Constraints enforce data integrity. Together, they're essential for building robust, performant databases.
Understanding Indexes
An index is a data structure that speeds up row lookups. Without an index, the database scans every row (full table scan). With an index, it jumps directly to matching rows.
-- Without index: scans all rows
SELECT * FROM books WHERE title = '1984';
-- With index on title: jumps directly to matching row
CREATE INDEX idx_books_title ON books(title);
SELECT * FROM books WHERE title = '1984'; -- Much faster!Note
Indexes trade storage space and write performance for read performance. Every INSERT, UPDATE, or DELETE must also update the index.
Types of Indexes
B-Tree Index (Default)
The standard index type. Efficient for equality and range queries.
CREATE INDEX idx_books_price ON books(price);
-- These queries use the index:
SELECT * FROM books WHERE price = 15.99;
SELECT * FROM books WHERE price > 15;
SELECT * FROM books WHERE price BETWEEN 10 AND 20;
SELECT * FROM books ORDER BY price;Unique Index
Ensures all values in the indexed column(s) are unique.
CREATE UNIQUE INDEX idx_customers_email ON customers(email);
-- This will fail if duplicate exists:
INSERT INTO customers (email) VALUES ('john@example.com');
INSERT INTO customers (email) VALUES ('john@example.com'); -- Error!Composite Index (Multi-Column)
Indexes multiple columns together. Column order matters!
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
-- Uses the full index:
SELECT * FROM orders WHERE customer_id = 1 AND order_date = '2025-01-15';
-- Uses index (left prefix):
SELECT * FROM orders WHERE customer_id = 1;
-- Cannot use index (missing left column):
SELECT * FROM orders WHERE order_date = '2025-01-15';Warning
A composite index on (A, B, C) helps queries filtering on A, or A+B, or A+B+C. It does NOT help queries filtering only on B or C.
Covering Index
When an index contains all columns needed by a query, the database reads only the index:
CREATE INDEX idx_books_author_title ON books(author_id, title, price);
-- This query is "covered" - no table access needed:
SELECT title, price FROM books WHERE author_id = 1;Full-Text Index
For searching text content:
-- MySQL
CREATE FULLTEXT INDEX idx_books_description ON books(description);
SELECT * FROM books WHERE MATCH(description) AGAINST('adventure fantasy');
-- PostgreSQL
CREATE INDEX idx_books_description_gin ON books USING gin(to_tsvector('english', description));
SELECT * FROM books WHERE to_tsvector('english', description) @@ to_tsquery('adventure & fantasy');Hash Index
Fast for exact equality only (no range queries):
-- PostgreSQL
CREATE INDEX idx_customers_email_hash ON customers USING hash(email);
-- Good for:
SELECT * FROM customers WHERE email = 'john@example.com';
-- Cannot use hash index:
SELECT * FROM customers WHERE email LIKE 'john%';Partial Index
Indexes only rows matching a condition:
-- Only index active orders (much smaller index)
CREATE INDEX idx_orders_active ON orders(order_date)
WHERE status = 'pending' OR status = 'shipped';
-- Only index non-null values
CREATE INDEX idx_customers_phone ON customers(phone) WHERE phone IS NOT NULL;Index Management
Creating Indexes
-- Basic syntax
CREATE INDEX index_name ON table_name(column1, column2, ...);
-- Unique index
CREATE UNIQUE INDEX index_name ON table_name(column);
-- Specify sort order
CREATE INDEX idx_orders_date_desc ON orders(order_date DESC);Viewing Indexes
-- MySQL
SHOW INDEX FROM books;
-- PostgreSQL
SELECT * FROM pg_indexes WHERE tablename = 'books';
-- SQL Server
SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID('books');Dropping Indexes
DROP INDEX idx_books_title ON books; -- MySQL
DROP INDEX idx_books_title; -- PostgreSQL, SQL ServerQuery Optimization with EXPLAIN
EXPLAIN shows how the database executes a query:
EXPLAIN SELECT * FROM books WHERE author_id = 1;Key things to look for:
| Indicator | Good | Bad |
|---|---|---|
| Type | index, ref, eq_ref, const | ALL (full scan) |
| Key | Shows index name | NULL |
| Rows | Low number | High number |
| Extra | Using index | Using filesort, Using temporary |
-- Full table scan (bad)
EXPLAIN SELECT * FROM books WHERE description LIKE '%adventure%';
-- type: ALL, key: NULL
-- Index lookup (good)
EXPLAIN SELECT * FROM books WHERE author_id = 1;
-- type: ref, key: idx_books_authorFinding slow queries:
-- MySQL: enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1; -- Log queries over 1 second
-- PostgreSQL: check query stats
SELECT query, calls, mean_time, total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;Indexing Best Practices
When to Create Indexes
- Primary keys (automatic)
- Foreign keys (speeds up JOINs)
- Columns in WHERE clauses (frequent filtering)
- Columns in ORDER BY (avoids sorting)
- Columns in GROUP BY (faster aggregation)
When NOT to Index
- Small tables - Full scan is often faster
- Columns with few distinct values (gender, status) - Low selectivity
- Frequently updated columns - Index maintenance overhead
- Columns rarely used in queries
Index Selectivity
Selectivity = distinct values / total rows. Higher is better.
-- High selectivity (good for indexing)
SELECT COUNT(DISTINCT email) / COUNT(*) FROM customers; -- ~1.0
-- Low selectivity (poor for indexing)
SELECT COUNT(DISTINCT status) / COUNT(*) FROM orders; -- ~0.003Constraints
Constraints enforce rules on data. They guarantee data integrity at the database level.
PRIMARY KEY
Uniquely identifies each row. Cannot be NULL.
CREATE TABLE books (
book_id INT PRIMARY KEY,
title VARCHAR(200) NOT NULL
);
-- Composite primary key
CREATE TABLE order_items (
order_id INT,
book_id INT,
quantity INT,
PRIMARY KEY (order_id, book_id)
);FOREIGN KEY
Ensures referential integrity between tables.
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);Referential Actions:
CREATE TABLE order_items (
order_id INT,
book_id INT,
FOREIGN KEY (order_id) REFERENCES orders(order_id)
ON DELETE CASCADE -- Delete items when order deleted
ON UPDATE CASCADE, -- Update items when order_id changes
FOREIGN KEY (book_id) REFERENCES books(book_id)
ON DELETE RESTRICT -- Prevent deleting book if in orders
ON UPDATE SET NULL -- Set to NULL if book_id changes
);| Action | Effect |
|---|---|
| CASCADE | Propagate change to child rows |
| RESTRICT | Prevent change if child rows exist |
| SET NULL | Set foreign key to NULL |
| SET DEFAULT | Set foreign key to default value |
| NO ACTION | Same as RESTRICT (deferred check) |
UNIQUE
Ensures all values in a column are distinct (NULLs allowed):
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
email VARCHAR(100) UNIQUE,
phone VARCHAR(20) UNIQUE
);
-- Multiple columns unique together
ALTER TABLE books ADD CONSTRAINT uk_isbn UNIQUE (isbn);NOT NULL
Prevents NULL values:
CREATE TABLE books (
book_id INT PRIMARY KEY,
title VARCHAR(200) NOT NULL,
price DECIMAL(10,2) NOT NULL
);CHECK
Validates data against a condition:
CREATE TABLE books (
book_id INT PRIMARY KEY,
price DECIMAL(10,2) CHECK (price > 0),
stock_quantity INT CHECK (stock_quantity >= 0),
publication_year INT CHECK (publication_year BETWEEN 1000 AND 2100)
);
-- Named constraint
ALTER TABLE orders
ADD CONSTRAINT chk_status
CHECK (status IN ('pending', 'shipped', 'completed', 'cancelled'));Note
MySQL ignores CHECK constraints in versions before 8.0.16. Always verify your database version supports the constraints you need.
DEFAULT
Provides a default value when none is specified:
CREATE TABLE orders (
order_id INT PRIMARY KEY,
order_date DATE DEFAULT (CURRENT_DATE),
status VARCHAR(20) DEFAULT 'pending',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO orders (order_id) VALUES (1);
-- order_date = today, status = 'pending', created_at = nowManaging Constraints
Adding Constraints
-- Add primary key
ALTER TABLE books ADD PRIMARY KEY (book_id);
-- Add foreign key
ALTER TABLE orders
ADD CONSTRAINT fk_customer
FOREIGN KEY (customer_id) REFERENCES customers(customer_id);
-- Add check constraint
ALTER TABLE books
ADD CONSTRAINT chk_price_positive
CHECK (price > 0);
-- Add unique constraint
ALTER TABLE customers
ADD CONSTRAINT uk_email UNIQUE (email);Viewing Constraints
-- MySQL
SELECT * FROM information_schema.table_constraints
WHERE table_name = 'orders';
-- PostgreSQL
SELECT conname, contype, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'orders'::regclass;Dropping Constraints
-- Drop foreign key
ALTER TABLE orders DROP FOREIGN KEY fk_customer;
-- Drop check constraint
ALTER TABLE books DROP CONSTRAINT chk_price_positive;
-- Drop unique constraint
ALTER TABLE customers DROP CONSTRAINT uk_email;Temporarily Disabling Constraints
-- MySQL: disable foreign key checks for bulk loading
SET FOREIGN_KEY_CHECKS = 0;
-- ... bulk insert ...
SET FOREIGN_KEY_CHECKS = 1;
-- PostgreSQL: disable triggers (includes FK checks)
ALTER TABLE orders DISABLE TRIGGER ALL;
-- ... bulk insert ...
ALTER TABLE orders ENABLE TRIGGER ALL;Warning
Disabling constraints risks data integrity. Only do this for controlled operations like data migrations, and verify data consistency afterward.
Practical Examples
Example 1: Optimizing a Slow Query
-- Before: slow query
EXPLAIN SELECT
c.first_name, c.last_name,
COUNT(*) as order_count,
SUM(o.total_amount) as total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.customer_id;
-- Full table scan on orders
-- Add strategic indexes
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Or a covering composite index
CREATE INDEX idx_orders_date_customer_amount
ON orders(order_date, customer_id, total_amount);
-- After: index-based query
EXPLAIN SELECT ... -- Now uses indexExample 2: Bookstore Schema with Constraints
CREATE TABLE authors (
author_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
birth_date DATE,
nationality VARCHAR(50),
CONSTRAINT chk_birth_date CHECK (birth_date < CURRENT_DATE)
);
CREATE TABLE books (
book_id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(200) NOT NULL,
author_id INT NOT NULL,
isbn VARCHAR(13) UNIQUE,
price DECIMAL(10,2) NOT NULL CHECK (price > 0),
stock_quantity INT DEFAULT 0 CHECK (stock_quantity >= 0),
publication_year INT CHECK (publication_year BETWEEN 1000 AND YEAR(CURRENT_DATE)),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (author_id) REFERENCES authors(author_id) ON DELETE RESTRICT,
INDEX idx_author (author_id),
INDEX idx_price (price)
);
CREATE TABLE customers (
customer_id INT AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) NOT NULL UNIQUE,
phone VARCHAR(20),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_email (email),
INDEX idx_name (last_name, first_name)
);
CREATE TABLE orders (
order_id INT AUTO_INCREMENT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL DEFAULT (CURRENT_DATE),
status VARCHAR(20) NOT NULL DEFAULT 'pending',
total_amount DECIMAL(10,2) NOT NULL DEFAULT 0,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE RESTRICT,
CONSTRAINT chk_status CHECK (status IN ('pending', 'shipped', 'completed', 'cancelled')),
INDEX idx_customer (customer_id),
INDEX idx_date (order_date),
INDEX idx_status (status)
);
CREATE TABLE order_items (
order_id INT NOT NULL,
book_id INT NOT NULL,
quantity INT NOT NULL CHECK (quantity > 0),
unit_price DECIMAL(10,2) NOT NULL CHECK (unit_price >= 0),
PRIMARY KEY (order_id, book_id),
FOREIGN KEY (order_id) REFERENCES orders(order_id) ON DELETE CASCADE,
FOREIGN KEY (book_id) REFERENCES books(book_id) ON DELETE RESTRICT
);Example 3: Audit Table with Indexes
CREATE TABLE audit_log (
log_id BIGINT AUTO_INCREMENT PRIMARY KEY,
table_name VARCHAR(50) NOT NULL,
action VARCHAR(10) NOT NULL,
row_id INT NOT NULL,
old_values JSON,
new_values JSON,
changed_by VARCHAR(100),
changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- Composite index for common queries
INDEX idx_table_action (table_name, action),
-- Index for time-based queries
INDEX idx_changed_at (changed_at),
-- Index for user-based queries
INDEX idx_changed_by (changed_by),
-- Covering index for dashboard query
INDEX idx_dashboard (changed_at, table_name, action)
);
-- Query patterns this supports:
SELECT * FROM audit_log WHERE table_name = 'orders' AND action = 'DELETE';
SELECT * FROM audit_log WHERE changed_at > DATE_SUB(NOW(), INTERVAL 24 HOUR);
SELECT * FROM audit_log WHERE changed_by = 'admin@example.com';
SELECT table_name, action, COUNT(*) FROM audit_log
WHERE changed_at > DATE_SUB(NOW(), INTERVAL 7 DAY)
GROUP BY table_name, action;Summary
This chapter covered:
- Indexes: B-tree, unique, composite, covering, full-text, hash, and partial indexes
- EXPLAIN: Analyzing query execution plans to identify optimization opportunities
- Constraints: PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, CHECK, DEFAULT
- Best Practices: When to index, selectivity, constraint management
Together, indexes and constraints form the foundation of a well-designed database—fast queries and guaranteed data integrity.
What's Next
The next chapter covers views—virtual tables that simplify complex queries and provide an abstraction layer over your data.