System Design Knowledge Base

A structured reference for engineers preparing for system design interviews or building production-grade distributed systems.

What's Covered

Topic | Description
Architecture Fundamentals | CAP theorem, consistency models, trade-offs
Capacity Planning & Estimation | Back-of-envelope math, traffic/storage estimation
Interview Framework | Structured approach to design interviews
Scaling Reads | Caching, read replicas, CDN, CQRS
Scaling Writes | Sharding, partitioning, write-ahead log
Real-Time Updates | WebSocket, SSE, polling strategies
Handling Contention | Locks, MVCC, optimistic concurrency
Large Blob Storage | Object storage, chunking, CDN delivery
Multi-Step Processes | Sagas, orchestration, choreography
Long-Running Tasks | Job queues, async patterns, progress tracking
Microservices Patterns | Service mesh, circuit breaker, API gateway
Database Design | Normalization, indexing, partitioning
Caching Strategies | Cache-aside, write-through, eviction policies
Message Queues & Streaming | Kafka, RabbitMQ, pub/sub, event sourcing
API Design | REST, gRPC, GraphQL, versioning
Distributed Systems | Consensus, leader election, clock sync
Security Patterns | AuthN/AuthZ, rate limiting, zero trust
Common Interview Questions | Full question bank with discussion points

How to Use This Guide

  1. For interviews: start with the Interview Framework, then study each pattern topic.
  2. For production systems: jump directly to the relevant pattern topic.
  3. For review: use the Common Interview Questions page as a self-test.

Key Principles to Internalize

  • There is no silver bullet: every design choice is a trade-off.
  • Identify bottlenecks first: don't optimize prematurely.
  • Consistency vs. Availability: know which one your use case needs.
  • Data is the hardest part: compute is cheap, storage and consistency are not.

Interview Questions

Q: How do you structure the first 5 minutes of a system design interview?

A: Clarify requirements and constraints, define scale assumptions, identify core entities and APIs, then propose a baseline architecture before deep dives.

Q: What distinguishes a senior-level system design answer from a mid-level one?

A: Seniors make explicit trade-offs, quantify scale, discuss failure modes, and connect design choices to operational concerns like SLOs, cost, and rollout risk.

Q: How do you decide what to design first: API, data model, or infrastructure?

A: Start from user flows and invariants, then model data and APIs, and finally map to infrastructure based on throughput, latency, and consistency requirements.

Q: How should you handle unknown numbers during estimation?

A: State assumptions clearly, use round-number math, and show sensitivity analysis to communicate how the design changes at 10x scale.
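The estimation approach above can be sketched as a small calculation. Every number here (10 million daily users, 20 requests per user, a 10% write ratio, 1 KB per write, a 2x peak factor) is an illustrative assumption, and the `estimate` helper is hypothetical, not something defined elsewhere in this guide.

```python
SECONDS_PER_DAY = 86_400  # often rounded to 100_000 for mental math


def estimate(daily_users, requests_per_user, write_ratio, bytes_per_write, peak_factor=2.0):
    """Return average QPS, peak QPS, and new storage per day in GB."""
    requests_per_day = daily_users * requests_per_user
    avg_qps = requests_per_day / SECONDS_PER_DAY
    writes_per_day = requests_per_day * write_ratio
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "storage_gb_per_day": writes_per_day * bytes_per_write / 1e9,
    }


# Stated assumptions, kept in one place so they are easy to challenge.
ASSUMPTIONS = dict(
    daily_users=10_000_000,
    requests_per_user=20,
    write_ratio=0.1,
    bytes_per_write=1_000,
)

baseline = estimate(**ASSUMPTIONS)
# Sensitivity check: the same model with 10x the users.
ten_x = estimate(**{**ASSUMPTIONS, "daily_users": ASSUMPTIONS["daily_users"] * 10})

for label, result in (("1x", baseline), ("10x", ten_x)):
    print(label, {name: round(value) for name, value in result.items()})
```

Under these assumptions the baseline works out to roughly 2,300 average QPS and 20 GB of new data per day, and both figures scale linearly at 10x. The useful interview discussion is which components stop coping somewhere between the two.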

Q: What is your framework for discussing consistency trade-offs?

A: Identify correctness requirements per operation, classify tolerance for stale reads, and choose patterns (quorum, idempotency, saga) accordingly.
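For the quorum pattern mentioned above, the commonly cited rule is that read and write quorums overlap when R + W > N. The sketch below states that rule and brute-force checks it for small clusters. The function names are hypothetical, and overlap alone does not give linearizability: real systems also need versioning or read repair.

```python
from itertools import combinations


def quorum_overlaps(n, w, r):
    """Rule of thumb: with N replicas, W write acks and R read acks overlap if R + W > N."""
    return r + w > n


def quorums_always_intersect(n, w, r):
    """Brute force: does every possible write set share a replica with every read set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# N=3, W=2, R=2 overlaps; N=3, W=1, R=1 may return stale reads.
print(quorum_overlaps(3, 2, 2), quorum_overlaps(3, 1, 1))
```

Operations that tolerate stale reads can use smaller quorums for lower latency, while operations with strict correctness requirements pay for a larger W or R.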

Q: How do you include reliability in an interview design without getting lost?

A: Cover failure domains, retries/timeouts, backpressure, graceful degradation, and observability hooks in a concise reliability pass.
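As one concrete piece of the reliability pass, here is a minimal sketch of bounded retries with exponential backoff and full jitter. `TransientError` and `call_with_retries` are hypothetical names introduced for illustration, and the per-call timeout is assumed to be enforced inside `operation` itself.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry."""


def call_with_retries(operation, max_attempts=4, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying TransientError with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the failure to the caller
            # Backoff ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

Bounding the attempts matters as much as the backoff: unbounded retries amplify load on a dependency that is already struggling, which is where backpressure and graceful degradation come into the discussion. Retried operations should also be idempotent.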

Q: When do you introduce caching in the interview flow?

A: Introduce caching only after the baseline bottlenecks are identified, then explain the cache key design, the invalidation strategy, and the consistency implications.
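A minimal cache-aside sketch can make the key design and invalidation points concrete. It uses an in-process dictionary as a stand-in for a real cache such as Redis or Memcached. The `CacheAside` class, the `entity:vN:id` key format, and the 60-second TTL are all assumptions chosen for illustration.

```python
import time


class CacheAside:
    """Read-through-on-miss cache with TTL expiry and explicit invalidation."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # fetches from the source of truth on a miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)

    @staticmethod
    def make_key(entity, entity_id, schema_version=1):
        # Versioned keys let a schema change bypass old entries without a flush.
        return f"{entity}:v{schema_version}:{entity_id}"

    def get(self, entity, entity_id):
        key = self.make_key(entity, entity_id)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: may be up to ttl_seconds stale
        value = self._loader(entity, entity_id)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, entity, entity_id):
        # Call after a successful write so the next read reloads fresh data.
        self._entries.pop(self.make_key(entity, entity_id), None)
```

The consistency implication is visible in `get`: without an explicit `invalidate` after writes, readers can observe data that is up to one TTL old, so the TTL is effectively the staleness bound being traded for load reduction.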

Q: How do you communicate cost-awareness in architecture decisions?

A: Compare options by resource profile (CPU, memory, storage, network, operations), then justify the cheapest design that still meets SLOs.
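The selection rule above, the cheapest design that still meets SLOs, can be sketched as a filter followed by a minimum. The `Option` fields, the option names, and every figure below are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    monthly_cost_usd: float
    p99_latency_ms: float
    availability: float  # fraction of time available, e.g. 0.999


def cheapest_meeting_slo(options, max_p99_ms, min_availability):
    """Return the lowest-cost option that satisfies both SLO targets, or None."""
    viable = [
        o for o in options
        if o.p99_latency_ms <= max_p99_ms and o.availability >= min_availability
    ]
    return min(viable, key=lambda o: o.monthly_cost_usd) if viable else None


# Illustrative numbers only.
candidates = [
    Option("single-region", 4_000, 180, 0.995),
    Option("single-region + cache", 5_500, 90, 0.999),
    Option("multi-region active-active", 14_000, 60, 0.9999),
]
choice = cheapest_meeting_slo(candidates, max_p99_ms=100, min_availability=0.999)
print(choice.name if choice else "no option meets the SLO")
```

In an interview, naming the option that was rejected for cost and the option that was rejected for missing the SLO shows that the choice came from the requirements rather than from preference.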