System Design Knowledge Base
A structured reference for engineers preparing for system design interviews or building production-grade distributed systems.
What's Covered
| Topic | Description |
|---|---|
| Architecture Fundamentals | CAP theorem, consistency models, trade-offs |
| Capacity Planning & Estimation | Back-of-envelope math, traffic/storage estimation |
| Interview Framework | Structured approach to design interviews |
| Scaling Reads | Caching, read replicas, CDN, CQRS |
| Scaling Writes | Sharding, partitioning, write-ahead log |
| Real-Time Updates | WebSocket, SSE, polling strategies |
| Handling Contention | Locks, MVCC, optimistic concurrency |
| Large Blob Storage | Object storage, chunking, CDN delivery |
| Multi-Step Processes | Sagas, orchestration, choreography |
| Long-Running Tasks | Job queues, async patterns, progress tracking |
| Microservices Patterns | Service mesh, circuit breaker, API gateway |
| Database Design | Normalization, indexing, partitioning |
| Caching Strategies | Cache-aside, write-through, eviction policies |
| Message Queues & Streaming | Kafka, RabbitMQ, pub/sub, event sourcing |
| API Design | REST, gRPC, GraphQL, versioning |
| Distributed Systems | Consensus, leader election, clock sync |
| Security Patterns | AuthN/AuthZ, rate limiting, zero trust |
| Common Interview Questions | Full question bank with discussion points |
How to Use This Guide
- For interviews: start with the Interview Framework, then study each pattern topic.
- For production systems: jump directly to the relevant pattern topic.
- For review: use the Common Interview Questions page as a self-test.
Key Principles to Internalize
- There is no silver bullet: every design choice is a trade-off.
- Identify bottlenecks first: don't optimize prematurely.
- Consistency vs. availability: know which one your use case needs.
- Data is the hardest part: compute is cheap, storage and consistency are not.
Interview Questions
Q: How do you structure the first 5 minutes of a system design interview?
A: Clarify requirements and constraints, define scale assumptions, identify core entities and APIs, then propose a baseline architecture before deep dives.
Q: What distinguishes a senior-level system design answer from a mid-level one?
A: Seniors make explicit trade-offs, quantify scale, discuss failure modes, and connect design choices to operational concerns like SLOs, cost, and rollout risk.
Q: How do you decide what to design first: API, data model, or infrastructure?
A: Start from user flows and invariants, then model data and APIs, and finally map to infrastructure based on throughput, latency, and consistency requirements.
Q: How should you handle unknown numbers during estimation?
A: State assumptions clearly, use round-number math, and show sensitivity analysis to communicate how the design changes at 10x scale.
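The estimation approach above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `estimate`, its parameters, and every input number (1M daily users, 10 requests per user, 1 KB per write, 10% writes, a 2x peak factor) are assumptions chosen only to keep the arithmetic round.

```python
SECONDS_PER_DAY = 86_400


def estimate(daily_active_users, requests_per_user_per_day,
             bytes_per_write, write_fraction, peak_factor=2.0):
    """Return average QPS, peak QPS, and daily storage growth in GB.

    All inputs are stated assumptions; the point is to make them explicit.
    """
    requests_per_day = daily_active_users * requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor
    # Only writes add stored bytes; 1e9 bytes is treated as 1 GB.
    storage_gb_per_day = requests_per_day * write_fraction * bytes_per_write / 1e9
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "storage_gb_per_day": storage_gb_per_day,
    }


# Sensitivity check: rerun the same model with 10x the users.
baseline = estimate(1_000_000, 10, 1_000, 0.1)
at_10x = estimate(10_000_000, 10, 1_000, 0.1)

for label, result in (("baseline", baseline), ("10x", at_10x)):
    print(label, {name: round(value, 1) for name, value in result.items()})
```

Running the model twice makes it easy to say out loud which components are fine at baseline and which would need to change at 10x.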
Q: What is your framework for discussing consistency trade-offs?
A: Identify correctness requirements per operation, classify tolerance for stale reads, and choose patterns (quorum, idempotency, saga) accordingly.
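For the quorum pattern mentioned above, a common rule of thumb in replicated stores is that a read of R replicas is guaranteed to overlap a write acknowledged by W replicas, out of N total, when R + W > N. The sketch below only checks that inequality; the function name `quorum_overlaps` is hypothetical, and real systems add further conditions (for example, conflict resolution and handling of failed writes) that are not modeled here.

```python
def quorum_overlaps(n, r, w):
    """Return True when every read quorum intersects every write quorum.

    n: number of replicas, r: replicas read, w: replicas that must
    acknowledge a write. Overlap is guaranteed exactly when r + w > n.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


# A few configurations for n = 3 replicas.
for r, w in ((1, 1), (2, 2), (1, 3)):
    print(f"N=3 R={r} W={w} overlap={quorum_overlaps(3, r, w)}")
```

Configurations with small R and W favor latency and availability but tolerate stale reads, which ties the choice back to the per-operation correctness requirements.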
Q: How do you include reliability in an interview design without getting lost?
A: Cover failure domains, retries/timeouts, backpressure, graceful degradation, and observability hooks in a concise reliability pass.
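The retry part of that reliability pass might look like the following sketch: a bounded number of attempts with capped exponential backoff and random jitter. The helper name `call_with_retries`, its defaults, and the choice of which exceptions count as transient are all assumptions for illustration; per-call timeouts are assumed to be enforced inside `operation` itself.

```python
import random
import time


def call_with_retries(operation, max_attempts=4, base_delay=0.1,
                      max_delay=2.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call operation(); retry transient failures with backoff and jitter.

    The delay before retry k is drawn uniformly from
    [0, min(max_delay, base_delay * 2**(k - 1))]. The last failure is
    re-raised so the caller can degrade gracefully.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

Bounding the attempts and capping the delay keeps retries from amplifying load on a dependency that is already struggling; the `sleep` parameter is injectable so the behavior can be tested without waiting.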
Q: When do you introduce caching in the interview flow?
A: After baseline bottlenecks are identified. Explain cache key design, invalidation strategy, and consistency implications.
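To make the key design, invalidation, and staleness points concrete, here is a minimal in-process sketch of the cache-aside pattern with a TTL and explicit invalidation. The class name `CacheAside`, the `load_from_db` callback, and the `user:1` key format are hypothetical; a production setup would more likely use an external cache, which this sketch does not model.

```python
import time


class CacheAside:
    """Read-through-on-miss cache with TTL expiry and explicit invalidation."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # called only on a miss or expiry
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]            # hit: may be stale until TTL or invalidate
        value = self._load(key)        # miss: fall back to the source of truth
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key, typically right after the underlying record is written."""
        self._entries.pop(key, None)
```

The consistency implication is visible in `get`: between a write to the database and either the TTL expiring or `invalidate` being called, readers can see the old value.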
Q: How do you communicate cost-awareness in architecture decisions?
A: Compare options by resource profile (CPU, memory, storage, network, operations), then justify the cheapest design that still meets SLOs.
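The "cheapest design that still meets SLOs" rule can be expressed as a filter followed by a minimum. In the sketch below, the `Option` fields, the function name `cheapest_meeting_slo`, and all costs, latencies, and availability figures are made-up placeholders, not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    monthly_cost_usd: float
    p99_latency_ms: float
    availability: float  # fraction of time available, e.g. 0.999


def cheapest_meeting_slo(options, max_p99_ms, min_availability):
    """Return the lowest-cost option that satisfies both SLO bounds, or None."""
    eligible = [
        o for o in options
        if o.p99_latency_ms <= max_p99_ms and o.availability >= min_availability
    ]
    return min(eligible, key=lambda o: o.monthly_cost_usd) if eligible else None


# Illustrative candidates only; the numbers are invented for the example.
candidates = [
    Option("single node", 400, 450, 0.99),
    Option("primary + replicas + cache", 1200, 180, 0.999),
    Option("multi-region active-active", 5400, 90, 0.9999),
]

choice = cheapest_meeting_slo(candidates, max_p99_ms=200, min_availability=0.999)
print(choice.name if choice else "no option meets the SLO")
```

Stating the SLO bounds first and the cost second makes the justification easy to follow: options that miss the SLO are excluded regardless of price, and among the rest the cheapest wins.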