Database Patterns for Microservices
Database-Per-Service Patternโ
Each microservice owns its own database โ no shared schema.
โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ
โ Order Service โ โ Inventory Svc โ โ Payment Svc โ
โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ
โ โ orders DB โ โ โ โ inv DB โ โ โ โ pay DB โ โ
โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ โ โโโโโโโโโโโโโ โ
โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโ
Why:
- Independent deployability (schema changes don't affect other services)
- Technology freedom (each service picks the best DB for its needs)
- Fault isolation (one DB down doesn't affect others)
- Independent scaling
Challenge: cross-service data consistency โ no distributed ACID transactions.
Transactional Outbox Patternโ
Problem: Atomically update the DB and publish an event to a message broker.
Naive approach (broken):
// โ NOT atomic โ network failure between steps loses the event
orderRepository.save(order); // Step 1: DB commit
messageQueue.publish(orderEvent); // Step 2: Kafka publish โ could fail!
The standard solution to this is the Transactional Outbox Pattern. For a complete guide with code examples, polling vs CDC (Debezium) trade-offs, and failure mitigation, see the dedicated Transactional Outbox Pattern Guide.
Saga Patternโ
To coordinate multi-step workflows across microservice boundaries without blocking database resources or relying on fragile distributed transactions (like 2PC), use the Saga Pattern (via Choreography or Orchestration).
For a complete guide including a detailed Orchestration vs. Choreography comparison matrix, compensating transaction playbooks, idempotency strategies, and Java entity/orchestrator implementations, see the dedicated Saga Pattern Guide.
CQRS โ Command Query Responsibility Segregationโ
For a comprehensive guide on separating read and write models, synchronization via Domain Events, and Event Sourcing theory, see the centralized CQRS & Event Sourcing page.
Event Sourcingโ
Instead of storing current state, store the sequence of events that led to it.
Traditional: store current state
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Order: id=1, status=shipped, โ
โ total=99.90, items=[...] โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Event Sourcing: store events
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ OrderCreated {id:1, user:42, items:[...]} โ
โ PaymentApplied {amount:99.90, method:card} โ
โ ItemsShipped {tracking:XYZ123, date:2024-01-15} โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
Current state = replay all events (or from last snapshot)
@Entity
public class EventStoreRecord {
@Id @GeneratedValue
private Long id;
private String aggregateId;
private String aggregateType;
private Long version;
private String eventType;
@Convert(converter = JsonConverter.class)
private Map<String, Object> payload;
private Instant occurredAt;
}
// Reconstitute aggregate
public Order load(String orderId) {
List<EventStoreRecord> events = eventStore
.findByAggregateIdOrderByVersion(orderId);
Order order = new Order();
events.forEach(e -> order.apply(e)); // replay events
return order;
}
Benefits:
- Full audit log built-in
- Temporal queries ("what was the state on Jan 15?")
- Event-driven architecture natural fit
- CQRS natural companion
Challenges:
- Querying current state requires projection (CQRS read model)
- Schema evolution of events is hard
- High volume โ large event stores (use snapshots)
API Composition vs Database Joinโ
When you need data from multiple services:
โ Don't: SELECT o.*, u.* FROM orders o JOIN users u ON ...
(services own their own DBs โ no cross-DB joins)
โ
Do: API Gateway or BFF aggregates:
1. GET /orders โ OrderService returns orders
2. GET /users/{ids} โ UserService returns user details
3. Merge in application layer
Or use a read-side projection that subscribes to both services' events and builds a pre-joined view.
Dual Write Problem & Solutionsโ
// โ Dual write race condition
db.save(entity); // succeeds
kafka.publish(event); // fails โ event lost, DB has data, Kafka doesn't
Solutions:
- Transactional Outbox (recommended)
- CDC with Debezium โ read DB changelog, publish to Kafka
- Event-First โ publish to Kafka first, DB write on consumption
- Change Data Capture โ treat DB as source of truth, derive events
๐ฏ Interview Questionsโ
Q1. What is the database-per-service pattern and why is it used in microservices?
Each service owns its own database with no shared schema. This enables independent deployment (schema changes are local), technology diversity (each service uses the best DB for its needs), fault isolation, and independent scaling. The trade-off is that cross-service consistency requires patterns like Saga and Outbox instead of distributed transactions.
Q2. What problem does the Transactional Outbox pattern solve?
It solves the dual-write problem: atomically updating the database AND publishing an event to a message broker. By writing the event to an outbox table in the same local transaction as the business data, you guarantee both succeed or both fail. A relay process (polling or CDC) then publishes outbox events to the broker asynchronously.
Q3. What is the Saga pattern? When would you use choreography vs orchestration?
A Saga breaks a distributed transaction into a sequence of local transactions with compensating transactions for rollback. Choreography: services react to domain events โ loose coupling but hard to trace. Orchestration: a central coordinator directs all steps โ easier to reason about, clearer visibility. Use choreography for simple flows; orchestration for complex multi-step processes.
Q4. What is CQRS and what problem does it solve?
Command Query Responsibility Segregation separates the write model (handling commands, enforcing business rules) from the read model (optimized for queries). Solves the mismatch between complex write logic (requiring normalized, consistent data) and read needs (requiring denormalized, prejoined data for performance). Enables independent scaling of reads and writes.
Q5. What is Event Sourcing and what are its trade-offs?
Event Sourcing stores domain events rather than current state โ current state is derived by replaying events. Benefits: full audit log, temporal queries, event-driven integration, natural CQRS fit. Trade-offs: complex to query (requires projections/CQRS), event schema evolution is hard, large event stores need snapshots, high learning curve.
Q6. How do you handle cross-service queries (e.g., "list orders with user details") in microservices?
Options: API composition in a BFF/gateway (call both services, merge in memory); CQRS read-side projection (event subscriber builds a pre-joined view in its own DB); API Gateway pattern. Never do cross-database JOINs directly โ each service's DB is its private implementation detail.
Q7. What is Change Data Capture (CDC) and how does Debezium work?
CDC captures row-level changes from the database's transaction log (WAL for PostgreSQL, binlog for MySQL) without modifying the application. For a complete deep dive, including schema evolution and snapshotting challenges, see the Change Data Capture (CDC) page.
Q8. What is the difference between eventual consistency and strong consistency in microservices?
Strong consistency: every read sees the latest write immediately โ achieved within a single database with ACID transactions, but impossible across distributed services without coordination cost. Eventual consistency: all replicas/services will converge to the same state eventually (after event propagation) โ cheaper, more available, but reads may be stale. Design systems to tolerate eventual consistency; use strong consistency only where business rules require it.
Advanced Editorial Pass: Data Patterns for Service Autonomy and Consistencyโ
Senior Engineering Focusโ
- Choose outbox, saga, and CQRS patterns by failure recovery requirements.
- Keep data ownership boundaries explicit across services.
- Design event contracts for replayability and backward compatibility.
Failure Modes to Anticipateโ
- Cross-service coupling through shared database shortcuts.
- Compensation workflows that fail to restore business invariants.
- Event ordering assumptions broken during retries and replays.
Practical Heuristicsโ
- Define consistency model per business process.
- Test saga and outbox behavior under duplicate and delayed events.
- Instrument end-to-end flow with correlation IDs and lag metrics.