Designing Data-Intensive Applications

The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, by Martin Kleppmann (O'Reilly, 2017)

What Is This Book About?

Many modern applications are data-intensive rather than compute-intensive: raw CPU power is rarely the bottleneck. The real challenges are:

  • The volume of data
  • The complexity of data
  • The speed at which data changes

This book cuts through the buzzwords (NoSQL, Big Data, CAP theorem, eventual consistency…) and explains the engineering principles behind the tools, so you can make smart architectural decisions.


Book Structure

The book is divided into three parts, covering 12 chapters:

📦 Part I: Foundations of Data Systems

Covers ideas that apply to any data system, whether on a single machine or a cluster.

Chapter      Topic
Chapter 1    Reliable, Scalable, and Maintainable Applications
Chapter 2    Data Models and Query Languages
Chapter 3    Storage and Retrieval
Chapter 4    Encoding and Evolution

🌐 Part II: Distributed Data

Covers what happens when data is spread across multiple machines, whether for scalability or for fault tolerance.

Chapter      Topic
Chapter 5    Replication
Chapter 6    Partitioning
Chapter 7    Transactions
Chapter 8    The Trouble with Distributed Systems
Chapter 9    Consistency and Consensus

🔄 Part III: Derived Data

Covers systems that transform and combine existing datasets to produce new, derived ones.

Chapter       Topic
Chapter 10    Batch Processing
Chapter 11    Stream Processing
Chapter 12    The Future of Data Systems

Who Should Read This?

  • Backend / platform engineers who store and process data at scale
  • Software architects choosing between databases, queues, and processing frameworks
  • Technical leads who need to reason about trade-offs in distributed systems

You should be comfortable with SQL and basic backend development. Everything else is explained from first principles.


Key Themes

Reliability      →  Working correctly even when things go wrong
Scalability      →  Handling growth in data, traffic, and complexity
Maintainability  →  Staying easy to work on over time, even as different people maintain the system

These three properties are introduced in Chapter 1 and recur throughout the rest of the book.

Quick Navigation