

Designing Data-Intensive Applications

Transactions

Things go wrong in data systems. Networks fail, disks crash, software has bugs, users do unexpected things. Transactions are the mechanism databases use to provide safety guarantees in the face of these faults.

A transaction groups multiple reads and writes into a logical unit. Either all succeed together (commit) or all fail together (abort/rollback)—no partial failures.

Note

Transactions aren't free. They add overhead and can limit scalability. Understanding what guarantees you actually need helps you make informed tradeoffs.


ACID: The Safety Guarantees

ACID is the classic mnemonic for transaction properties:

Atomicity

"All or nothing." If a transaction fails partway through, any changes already made are rolled back. The database is never left in a half-finished state.

Atomicity in Action

SQL
BEGIN TRANSACTION;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
  -- Something fails here!
ROLLBACK; -- All changes are undone
 
-- Account 1 still has original balance
-- No money was lost or created

Consistency

The database moves from one valid state to another. Invariants (like "total money in the system is constant") are preserved.

Warning

"Consistency" in ACID is actually about application invariants, not the "C" in CAP theorem (which is about replicas agreeing). Confusingly different concepts!

Isolation

Concurrently executing transactions don't see each other's uncommitted changes. Each transaction runs as if it were the only one.

Durability

Once committed, data isn't lost—even if the system crashes. Usually means written to disk and/or replicated.


Single-Object vs Multi-Object Transactions

Single-Object Writes

Most databases provide atomicity and isolation for single-object operations:

  • Writing a 20KB JSON document: all or nothing
  • Incrementing a counter: atomic read-modify-write

These are implemented with:

  • Write-ahead log (WAL) for crash recovery
  • Locks or compare-and-swap for isolation

Multi-Object Transactions

Real applications often need to modify multiple objects atomically:

  • Inserting a row and updating a foreign key reference
  • Updating denormalized data across tables
  • Maintaining secondary indexes

Why Multi-Object Matters

SQL
-- Transfer money between accounts
BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
 
-- Without transaction: crash between updates loses money!

Isolation Levels

"Isolation" sounds absolute, but databases offer different degrees. Stronger isolation has higher overhead.

Read Uncommitted

Transactions can see uncommitted changes from other transactions (dirty reads).

Almost never useful—mostly a theoretical baseline.

Read Committed

  • No dirty reads: only see committed data
  • No dirty writes: can't overwrite uncommitted data

The most basic useful isolation level. Default in PostgreSQL, Oracle, SQL Server.

Read Committed Prevents Dirty Reads

PLAINTEXT
Transaction A:              Transaction B:
-- initially, val = 1
UPDATE x SET val = 2;
                            SELECT val FROM x;  -- sees 1 (old value)
COMMIT;
                            SELECT val FROM x;  -- sees 2 (committed)

Repeatable Read (Snapshot Isolation)

Each transaction sees a consistent snapshot of the database as it existed when the transaction started. No non-repeatable reads.

Non-Repeatable Read Problem

PLAINTEXT
Transaction A:
  SELECT balance FROM accounts WHERE id = 1;  -- returns 500
 
  -- Meanwhile Transaction B:
  -- UPDATE accounts SET balance = 300 WHERE id = 1; COMMIT;
 
  SELECT balance FROM accounts WHERE id = 1;  -- returns 300 (!!)

With snapshot isolation, both reads return 500.
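
You opt into this level explicitly. A sketch in PostgreSQL-style syntax, where REPEATABLE READ is implemented as snapshot isolation:

SQL
BEGIN ISOLATION LEVEL REPEATABLE READ;
  SELECT balance FROM accounts WHERE id = 1;  -- 500
  -- another transaction commits balance = 300 here
  SELECT balance FROM accounts WHERE id = 1;  -- still 500 (same snapshot)
COMMIT;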

Serializable

The strongest level. Transactions execute as if they ran serially, one after another. No concurrency anomalies.


Implementing Snapshot Isolation

Snapshot isolation uses multi-version concurrency control (MVCC):

  1. Each transaction gets a unique, increasing transaction ID
  2. Each write creates a new version of the row, tagged with the transaction ID
  3. Reads only see versions from transactions that committed before the read transaction started
  4. Old versions are garbage collected when no transaction needs them

MVCC Versions

PLAINTEXT
Row versions for account 1:
  Version 1 (txn 100): balance = 500  [created_by=100, deleted_by=null]
  Version 2 (txn 105): balance = 400  [created_by=105, deleted_by=null]
 
Transaction 103 sees: balance = 500 (Version 1)
  (txn 105 had not committed when txn 103 started, so Version 2 is invisible)
 
Transaction 110 sees: balance = 400 (Version 2)
  (the latest version whose creating transaction committed before txn 110 started)
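
PostgreSQL exposes a slice of this bookkeeping directly (an illustration, assuming PostgreSQL): every row carries a hidden xmin system column recording the transaction that created that version.

SQL
-- xmin is the ID of the transaction that created the visible row version
SELECT xmin, id, balance FROM accounts WHERE id = 1;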

Preventing Lost Updates

A common concurrency problem: two transactions read, modify, and write the same data. The second write overwrites the first.

Lost Update

PLAINTEXT
Counter = 5
 
Transaction A:               Transaction B:
  read counter: 5
                               read counter: 5
  counter = 5 + 1
                               counter = 5 + 1
  write counter: 6
                               write counter: 6  -- Lost A's increment!

Result: counter = 6 instead of 7.

Solutions

Atomic operations: Use built-in atomic updates:

SQL
UPDATE counters SET value = value + 1 WHERE id = 1;

Explicit locking:

SQL
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- Now we have a lock; other transactions wait
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

Compare-and-set:

SQL
UPDATE wiki_pages
SET content = 'new content', version = 2
WHERE id = 123 AND version = 1;
-- Fails if version changed (someone else modified it)

Automatic detection: the database detects the lost update and aborts one of the transactions, which can then be retried.
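
PostgreSQL's REPEATABLE READ level detects this case, for example: the transaction that tries to overwrite an already-updated row is aborted and must be retried (a sketch):

SQL
BEGIN ISOLATION LEVEL REPEATABLE READ;
  UPDATE counters SET value = value + 1 WHERE id = 1;
  -- If another snapshot transaction updated and committed this row first:
  -- ERROR: could not serialize access due to concurrent update
COMMIT;  -- on that error, the application retries the whole transaction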


Write Skew and Phantoms

Some anomalies aren't prevented by snapshot isolation.

Write Skew

Two transactions read the same data, make decisions based on it, and write to different objects—creating an invalid state.

Write Skew: On-Call Doctors

PLAINTEXT
Invariant: At least one doctor must be on call
 
Doctors on call: Alice, Bob
 
Transaction A:                Transaction B:
  SELECT COUNT(*) WHERE         SELECT COUNT(*) WHERE
    on_call = true;               on_call = true;
  -- returns 2                  -- returns 2
 
  Alice can leave!              Bob can leave!
 
  UPDATE SET on_call = false    UPDATE SET on_call = false
    WHERE name = 'Alice';         WHERE name = 'Bob';
 
COMMIT;                         COMMIT;
 
Result: No one is on call! Invariant violated.

Each transaction saw a valid state and made a valid decision, but the combination is invalid.
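
Snapshot isolation misses this because the two transactions write different rows. One mitigation (a sketch, assuming a doctors table matching the example) is to explicitly lock the rows the decision depends on:

SQL
BEGIN;
  SELECT * FROM doctors WHERE on_call = true FOR UPDATE;
  -- Count the locked rows in the application; proceed only if more than one
  UPDATE doctors SET on_call = false WHERE name = 'Alice';
COMMIT;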

Phantoms

A transaction queries for rows matching a condition, another transaction inserts/deletes rows that would change the result, and the first transaction proceeds with stale information.

Phantom: Meeting Room Booking

PLAINTEXT
Transaction A:
  SELECT * FROM bookings
    WHERE room = 'A' AND time = '10:00';
  -- No bookings found, room is free!
 
  -- Meanwhile Transaction B:
  -- INSERT INTO bookings (room, time) VALUES ('A', '10:00');
  -- COMMIT;
 
  INSERT INTO bookings (room, time) VALUES ('A', '10:00');
  -- Double booking!

The problem: Transaction A's WHERE clause matched no rows, so there was nothing to lock.
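
One fix (a sketch, assuming the bookings schema above) is to let the database enforce the invariant with a constraint, so the decision no longer rests on the stale read:

SQL
-- A unique constraint turns the phantom into an ordinary write conflict
ALTER TABLE bookings
  ADD CONSTRAINT one_booking_per_slot UNIQUE (room, time);
-- Transaction A's INSERT for ('A', '10:00') now fails if B committed first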


Serializable Isolation

The only way to prevent all these anomalies is serializable isolation. Three approaches:

Actual Serial Execution

Literally run transactions one at a time, on a single thread.

Sounds extreme, but it works when:

  • Transactions complete quickly
  • All data fits in memory
  • The write throughput of a single CPU is sufficient

VoltDB and Redis use this approach.

Note

To make serial execution fast, avoid interactive transactions. Submit the entire transaction as a stored procedure.
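
A sketch of the idea in PostgreSQL syntax (VoltDB uses Java stored procedures, but the shape is the same); the procedure name and arguments are illustrative:

SQL
CREATE PROCEDURE transfer(from_id int, to_id int, amount numeric)
LANGUAGE sql
AS $$
  UPDATE accounts SET balance = balance - amount WHERE id = from_id;
  UPDATE accounts SET balance = balance + amount WHERE id = to_id;
$$;
 
CALL transfer(1, 2, 100);  -- the whole transaction in one round-trip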

Two-Phase Locking (2PL)

The traditional approach:

  • Readers block writers, and writers block readers
  • Phase 1: acquire locks as the transaction reads and writes
  • Phase 2: release all locks only when the transaction commits or aborts

Problem: prone to deadlocks and poor throughput under contention.

  • Predicate locks: lock all rows matching a condition, including rows that don't exist yet (solves phantoms)
  • Index-range locks: approximate predicate locks by locking a broader index range, trading precision for efficiency

Serializable Snapshot Isolation (SSI)

An optimistic approach:

  1. Execute transactions using snapshot isolation (no blocking)
  2. Track reads and writes
  3. At commit, check for conflicts
  4. Abort and retry if conflicts detected

Advantage: Better performance than 2PL when conflicts are rare. Used by: PostgreSQL's SERIALIZABLE level, FoundationDB.
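
In PostgreSQL, SSI is what the SERIALIZABLE level runs; the on-call example from earlier would be caught at commit time (a sketch):

SQL
BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM doctors WHERE on_call = true;  -- returns 2
  UPDATE doctors SET on_call = false WHERE name = 'Alice';
COMMIT;
-- If the concurrent transaction (removing Bob) committed first, the COMMIT fails:
-- ERROR: could not serialize access due to read/write dependencies among transactions
-- The application retries the whole transaction.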


Distributed Transactions

When data spans multiple partitions or databases, transactions become harder.

Two-Phase Commit (2PC)

A protocol for atomic commit across multiple nodes:

Phase 1 (Prepare):

  1. Coordinator asks all participants: "Can you commit?"
  2. Each participant writes to durable storage and responds "yes" or "no"
  3. If any says "no," abort everywhere

Phase 2 (Commit/Abort):

  1. Coordinator decides (commit if all said yes)
  2. Writes decision to durable log
  3. Sends decision to all participants
  4. Participants commit or abort

Two-Phase Commit

PLAINTEXT
Coordinator                  Participant A    Participant B
    │                             │                │
    ├──── PREPARE ───────────────►│                │
    ├──── PREPARE ────────────────────────────────►│
    │                             │                │
    │◄─── YES ───────────────────│                │
    │◄─── YES ────────────────────────────────────│
    │                             │                │
    │  (write commit to log)      │                │
    │                             │                │
    ├──── COMMIT ────────────────►│                │
    ├──── COMMIT ─────────────────────────────────►│
    │                             │                │
    │◄─── ACK ───────────────────│                │
    │◄─── ACK ────────────────────────────────────│
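
PostgreSQL exposes the participant side of this protocol (a sketch; an external coordinator drives these commands on each node, and max_prepared_transactions must be set above zero):

SQL
BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'transfer-42';  -- phase 1: durable "yes" vote, locks held
 
-- ...coordinator collects votes from every participant...
 
COMMIT PREPARED 'transfer-42';      -- phase 2: apply the coordinator's decision
-- or: ROLLBACK PREPARED 'transfer-42';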

Warning

If the coordinator crashes after participants have voted "yes" but before it sends the final decision, those participants are in doubt: they cannot commit or abort on their own and must hold their locks until the coordinator recovers. 2PC has serious availability implications.


Summary

Transactions provide crucial safety guarantees:

Property       Guarantee
Atomicity      All or nothing
Consistency    Valid state to valid state
Isolation      Transactions don't interfere
Durability     Committed data persists

Isolation levels trade safety for performance:

Level              Dirty Read   Non-Repeatable Read   Phantom   Lost Update   Write Skew
Read Uncommitted   ✗            ✗                     ✗         ✗             ✗
Read Committed     ✓            ✗                     ✗         ✗             ✗
Snapshot           ✓            ✓                     ✗         Some          ✗
Serializable       ✓            ✓                     ✓         ✓             ✓

✓ = prevented, ✗ = possible

Serializable isolation approaches:

Approach            Blocking                Best For
Serial execution    N/A                     Fast transactions, in-memory data
Two-phase locking   Writers block readers   Traditional RDBMS
SSI (optimistic)    No                      Read-heavy, few conflicts

Note

Many applications don't need serializable isolation. Understanding your actual requirements helps you choose the right tradeoffs.