Designing Data-Intensive Applications
by Martin Kleppmann
A practical guide to building reliable, scalable, and maintainable data systems. Learn the fundamental principles behind databases, distributed systems, and data processing architectures.
Last updated: 2026-02-05
Chapters
This guide covers the following chapters. Work through them in order for the best learning experience.
Reliable, Scalable, and Maintainable Applications
The foundational principles for building data-intensive applications
Data Models and Query Languages
Relational vs. document models, graph databases, and query paradigms
Storage and Retrieval
How databases store data and how to find it efficiently
Encoding and Evolution
Data formats, schema evolution, and compatibility
Replication
Keeping copies of data on multiple machines
Partitioning
Splitting data across multiple machines
Transactions
Guarantees in the presence of faults and concurrency
The Trouble with Distributed Systems
Understanding what can go wrong in distributed systems
Consistency and Consensus
Achieving agreement in distributed systems
Batch Processing
Processing large volumes of data efficiently
Stream Processing
Processing data in real time as it arrives
The Future of Data Systems
Bringing it all together and looking ahead