
Microservices Design Patterns


When to Use Microservices

Use microservices when:

  • Different services have vastly different scaling needs
  • Teams are large and need independent deployment
  • Different services need different tech stacks

Stick with monolith when:

  • Small team (< 10 engineers)
  • Early-stage startup (complexity kills speed)
  • Domain isn't well-understood yet

"Start with a monolith, split when you feel the pain." (Martin Fowler)


API Gateway Pattern

Reverse Proxy vs. Load Balancer vs. API Gateway

API Gateways are often confused with reverse proxies and load balancers. To see a detailed comparison of their roles and how they coexist in a production topology, check out the Reverse Proxy vs. Load Balancer vs. API Gateway Guide.

The Problem: In a microservices architecture, a client application frequently needs to consume data from dozens of distinct services to render a single screen. If the client communicates directly with each service, it leads to chatty communication, tight coupling to backend infrastructure, and security nightmares (since every internal service must be exposed to the public internet and handle its own authentication).

The Solution: Implement an API Gateway as the single, unified entry point for all client requests. It acts as a highly resilient reverse proxy, routing requests to appropriate backend services and aggregating results.

Key Responsibilities

  • Security & Gateway Offloading: SSL termination, token validation, and IP allowlisting at the edge so backend services remain securely hidden in private subnets.
  • Routing & Composition: Fan-out requests to multiple services concurrently, aggregate the responses, and prune out internal data to reduce round-trips over slow mobile networks.
  • Cross-Cutting Concerns: Centralized rate limiting, global caching, distributed trace ID generation, and CORS management.

Backend for Frontend (BFF) Variant

Instead of a single monolithic API Gateway for all clients, the BFF pattern uses multiple smaller gateways tailored to specific client form factors (e.g., one BFF for the iOS app, one BFF for the Web Portal). This prevents the master API Gateway from becoming a bloated bottleneck and allows individual frontend client teams to own and iterate on their specific gateway.

Advantages & Disadvantages

| Advantages | Disadvantages |
| --- | --- |
| Encapsulation: Hides the internal structure of the application from clients. Clients don't need to know whether an endpoint is powered by 1 service or 10. | Single Point of Failure: If the gateway goes down, the entire application becomes inaccessible. It must be highly available. |
| Reduced Chatter: Aggregating data at the gateway significantly reduces the number of network round-trips for mobile clients. | Latency Bottleneck: Adds an extra network hop and potential processing overhead to every request. |
| Centralized Governance: A single place to enforce authentication, rate limiting, and standard observability headers. | Deployment Bottleneck: A single large gateway can become a coupling point where multiple teams contend to deploy routing rules. |

Common implementations:

  • Spring Cloud Gateway: Java/Spring-based, highly customizable, uses non-blocking Netty.
  • Kong API Gateway: Nginx-based, extremely fast, highly extensible via Lua plugins.
  • AWS API Gateway: Fully managed serverless proxy, deeply integrated with AWS Lambda and IAM.
  • Traefik / NGINX: Standard, highly performant reverse proxies.

Mobile   ╮
Web      ├→ API Gateway → Auth → Rate Limit → Route to Service
Partners ╯

// Spring Cloud Gateway
// redisRateLimiter() and userKeyResolver() are assumed to be @Bean methods defined elsewhere.
@Configuration
public class GatewayConfig {

    @Bean
    public RouteLocator routes(RouteLocatorBuilder builder) {
        return builder.routes()
            .route("user-service", r -> r
                .path("/api/users/**")
                .filters(f -> f
                    .stripPrefix(1)
                    .addRequestHeader("X-Internal-Source", "gateway")
                    .circuitBreaker(c -> c.setName("user-cb").setFallbackUri("forward:/fallback"))
                )
                .uri("lb://user-service") // Load-balanced via service discovery (e.g. Eureka)
            )
            .route("order-service", r -> r
                .path("/api/orders/**")
                .filters(f -> f.requestRateLimiter(rl -> rl
                    .setRateLimiter(redisRateLimiter())
                    .setKeyResolver(userKeyResolver())
                ))
                .uri("lb://order-service")
            )
            .build();
    }
}

Circuit Breaker Pattern

The Problem: When one microservice synchronously calls another over a network, network glitches or heavy downstream load can cause timeouts. If Service A calls a struggling Service B, Service A's worker threads block while waiting. Eventually, Service A runs completely out of threads responding to incoming requests, causing a cascading failure that rips across the entire distributed system.

The Solution: Wrap remote calls in a Circuit Breaker object, which actively monitors for failures and prevents cascading collapse.

How It Works

The circuit breaker operates in three distinct states, acting as an automated electrical safeguard:

CLOSED (normal) → failures exceed threshold → OPEN (reject all)
                                                ↓ after timeout
                                         HALF-OPEN (test few requests)
                                                ↓ success → CLOSED
                                                ↓ failure → OPEN again

  • CLOSED (Normal): Requests flow freely. The circuit breaker records the outcome of recent calls (failures, timeouts, slow calls), typically over a sliding window.
  • OPEN (Failing): If the failure/slowness rate exceeds a configured threshold, the circuit "trips". All subsequent calls immediately fail fast (throwing a CallNotPermittedException or returning a fallback) without attempting the network call. This completely lifts the load off the struggling downstream service, giving it breathing room to recover.
  • HALF-OPEN (Testing): After a predefined cooldown period, the circuit allows a small number of probing test requests through. If they succeed, the circuit resets to CLOSED. If they fail, it trips back to OPEN.
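
The three states can be sketched in plain Java. This is a minimal, single-threaded illustration and not the Resilience4j implementation: `SimpleCircuitBreaker` and its parameters are hypothetical names, it counts consecutive failures instead of a failure rate over a sliding window, and the caller passes in the current time so the transitions are easy to follow.

```java
import java.util.function.Supplier;

/** Hypothetical, simplified circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED/OPEN. */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;     // consecutive failures that trip the breaker
    private final long openDurationMillis;  // cooldown before a probe call is allowed
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMillis = 0;

    SimpleCircuitBreaker(int failureThreshold, long openDurationMillis) {
        this.failureThreshold = failureThreshold;
        this.openDurationMillis = openDurationMillis;
    }

    State state() {
        return state;
    }

    /** Runs the remote call, or returns the fallback without calling while the circuit is open. */
    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAtMillis < openDurationMillis) {
                return fallback.get();      // fail fast, no network call
            }
            state = State.HALF_OPEN;        // cooldown elapsed: let one probe through
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;
            state = State.CLOSED;           // success (including a successful probe) closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;         // trip, or re-trip after a failed probe
                openedAtMillis = nowMillis;
            }
            return fallback.get();
        }
    }
}
```

A production breaker would also need thread safety and a limit on concurrent probes in the half-open state; both are left out here to keep the state machine visible.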

Best Practices

  • Graceful Fallbacks: Always provide a logical fallback method. Return a sensible default value, an empty list, a cached stale response, or a simplified UI model so the user barely notices the outage.
  • Low Timeouts: Circuit breakers must be paired with aggressive HTTP timeouts. Don't wait 30 seconds for a doomed request to fail.
// Resilience4j Circuit Breaker with Spring Boot
// `log` is assumed to be an SLF4J logger on the enclosing class.
@CircuitBreaker(name = "inventoryService", fallbackMethod = "inventoryFallback")
@TimeLimiter(name = "inventoryService")
@Retry(name = "inventoryService")
public CompletableFuture<InventoryResponse> checkInventory(Long itemId) {
    return CompletableFuture.supplyAsync(() ->
        inventoryClient.check(itemId)
    );
}

// Fallback: same parameters as the protected method, plus the exception
public CompletableFuture<InventoryResponse> inventoryFallback(Long itemId, Exception ex) {
    log.warn("Inventory service unavailable, using fallback for item {}", itemId);
    return CompletableFuture.completedFuture(
        InventoryResponse.assumeAvailable(itemId) // Graceful degradation
    );
}

# application.yml
resilience4j:
  circuitbreaker:
    instances:
      inventoryService:
        registerHealthIndicator: true
        slidingWindowSize: 10
        minimumNumberOfCalls: 5
        permittedNumberOfCallsInHalfOpenState: 3
        waitDurationInOpenState: 10s
        failureRateThreshold: 50
  retry:
    instances:
      inventoryService:
        maxAttempts: 3
        waitDuration: 500ms
        enableExponentialBackoff: true
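
To illustrate the "Low Timeouts" practice without any library, the sketch below bounds a call with `CompletableFuture.completeOnTimeout` (Java 9+) and substitutes a fallback on timeout or error. `TimeoutWithFallback` and `callWithTimeout` are hypothetical names chosen for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper: bound a call's latency and degrade to a fallback value. */
final class TimeoutWithFallback {

    /** Returns the call's result, or the fallback if it fails or exceeds the timeout. */
    static <T> T callWithTimeout(Supplier<T> remoteCall, long timeoutMillis, T fallback) {
        return CompletableFuture.supplyAsync(remoteCall)
                // If the call has not finished in time, complete with the fallback instead
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                // If the call threw, degrade to the same fallback
                .exceptionally(ex -> fallback)
                .join();
    }
}
```

The timed-out task keeps running on the common pool in this sketch, so a real client should still set connect and read timeouts on the HTTP call itself.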

Bulkhead Pattern

The Problem: If a service uses a single shared thread pool or connection pool to execute all outgoing network requests, a single slow downstream dependency will exhaust the entire pool. For example, if the Payment Integration is experiencing severe lag, all available application threads will eventually get stuck blocking on the Payment call. This leaves zero threads available to process incoming requests for completely unrelated, perfectly healthy endpoints (like viewing a user profile entirely read from a local cache).

The Solution: Isolate failure domains by partitioning system resources. The name comes from shipbuilding: a ship's hull is divided into isolated watertight compartments (bulkheads). If one compartment gets punctured and floods, the water is contained strictly to that section, preventing the entire ship from sinking.

By partitioning threads, memory, or connection pools, a catastrophic failure in one integration only exhausts the resources allocated specifically to that partition. The rest of the application remains highly responsive.

Types of Bulkheads

  • Thread Pool Bulkhead: Assigns a dedicated, isolated thread pool to each downstream service. If the 10 threads allocated for PaymentService fill up, the 50 threads allocated for ProductCatalog keep working. Preferred for synchronous/blocking calls, though it adds thread context-switching overhead.
  • Semaphore Bulkhead: Uses atomic counters (semaphores) to limit the number of concurrent requests to a specific service, executing within the existing caller thread. Extremely lightweight and preferred for non-blocking reactive architectures.

// Resilience4j Bulkhead
@Bulkhead(name = "paymentService", type = Bulkhead.Type.THREADPOOL)
public CompletableFuture<PaymentResult> charge(PaymentRequest req) {
    return CompletableFuture.supplyAsync(() -> paymentGateway.charge(req));
}

# application.yml
resilience4j:
  thread-pool-bulkhead:
    instances:
      paymentService:
        maxThreadPoolSize: 10 # Isolated pool for payment
        coreThreadPoolSize: 5
        queueCapacity: 100
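
The semaphore variant described above can be sketched with `java.util.concurrent.Semaphore`. This is an illustration under simple assumptions, not Resilience4j's implementation: `SemaphoreBulkhead` is a hypothetical class, and it rejects immediately instead of waiting for a permit.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical semaphore bulkhead: caps concurrent calls to one dependency. */
final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call on the caller's thread if a permit is free; otherwise returns the fallback at once. */
    <T> T execute(Supplier<T> call, Supplier<T> rejectedFallback) {
        if (!permits.tryAcquire()) {
            return rejectedFallback.get();  // partition is full: do not queue, do not block
        }
        try {
            return call.get();
        } finally {
            permits.release();              // always hand the permit back
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Because the call runs on the caller's thread, this limits concurrency but cannot interrupt a call that hangs; that is the trade-off against a dedicated thread pool.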

Service Discovery

Client-Side (Eureka)

// Register service
// Note: @EnableEurekaClient is optional when the Eureka client starter is on the
// classpath, and newer Spring Cloud releases no longer ship the annotation.
@SpringBootApplication
@EnableEurekaClient
public class InventoryServiceApplication { ... }

// Discover and call
@LoadBalanced
@Bean
public RestTemplate restTemplate() { return new RestTemplate(); }

// Usage: the service name "inventory-service" is resolved through Eureka
restTemplate.getForObject("http://inventory-service/items/{id}", Item.class, id);
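
What a client-side discovery library does behind that call can be sketched as a registry lookup plus round-robin selection. `InMemoryRegistry` is a hypothetical stand-in for a real registry such as Eureka; health checks, leases, and caching are omitted.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory registry with client-side round-robin load balancing. */
final class InMemoryRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> cursors = new ConcurrentHashMap<>();

    void register(String service, String address) {
        instances.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    void deregister(String service, String address) {
        List<String> list = instances.get(service);
        if (list != null) {
            list.remove(address);
        }
    }

    /** Picks the next registered instance in round-robin order. */
    String resolve(String service) {
        // Snapshot so the size cannot change between the check and the lookup
        List<String> snapshot = List.copyOf(instances.getOrDefault(service, List.of()));
        if (snapshot.isEmpty()) {
            throw new IllegalStateException("No instances registered for " + service);
        }
        int next = cursors.computeIfAbsent(service, k -> new AtomicInteger()).getAndIncrement();
        return snapshot.get(Math.floorMod(next, snapshot.size()));
    }
}
```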

Server-Side (Kubernetes)

  • K8s DNS: inventory-service.default.svc.cluster.local
  • No client-side library needed: cluster DNS resolves the name to the Service's ClusterIP, and kube-proxy forwards traffic to a healthy pod

Sidecar Pattern

Attach a proxy container to each service for cross-cutting concerns.

┌────────────────────────────────────┐
│ Pod                                │
│ ┌──────────────┐  ┌───────┐        │
│ │ Your Service │←→│ Envoy │←──── Observability, mTLS, retries
│ └──────────────┘  │ Proxy │        │
│                   └───────┘        │
└────────────────────────────────────┘

Used by: Istio (Envoy sidecar), Linkerd, Dapr.


Service Mesh

Automates service-to-service communication: retries, timeouts, mTLS, load balancing, observability.

| Feature | Without Service Mesh | With Service Mesh (Istio) |
| --- | --- | --- |
| mTLS | Manual cert management | Automatic |
| Retries | In every service | Centralized policy |
| Traffic splitting | Manual deployment | VirtualService rules |
| Observability | Manual instrumentation | Automatic traces/metrics |

# Istio VirtualService: canary deployment
# Subsets v1 and v2 are defined in a matching DestinationRule (not shown).
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: user-service
spec:
  hosts:
    - user-service
  http:
    - route:
        - destination:
            host: user-service
            subset: v1
          weight: 90
        - destination:
            host: user-service
            subset: v2 # New version
          weight: 10

Strangler Fig Pattern

Incrementally migrate a monolith to microservices.

Phase 1: Monolith handles all traffic
Phase 2: New service handles feature X → Route /feature-x to microservice
Phase 3: Expand: route more features to microservices
Phase 4: Monolith retired

Beginner View

The Strangler pattern avoids a risky "big-bang" rewrite. New features are built in services, and selected old endpoints are routed away from the monolith over time.

Senior Deep Dive

Use an explicit migration blueprint:

  1. Edge routing layer (gateway/proxy) decides old vs new path
  2. Anti-corruption layer (ACL) shields new services from monolith schema/domain leaks
  3. Contract tests ensure old/new behavior parity
  4. Cutover metrics define when a route can be fully switched
Client -> Gateway
|- /orders/* -> New Order Service
|- /legacy/* -> Monolith
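
The edge-routing decision in the diagram above can be sketched as a prefix table with the monolith as the default target. `StranglerRouter` and the service names are hypothetical; a real gateway would express the same rules in its own route configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical strangler router: migrated path prefixes go to new services, everything else to the monolith. */
final class StranglerRouter {
    private final Map<String, String> migratedPrefixes = new LinkedHashMap<>(); // path prefix -> new service
    private final String monolith;

    StranglerRouter(String monolith) {
        this.monolith = monolith;
    }

    /** Cut a route over to a new service. */
    void migrate(String pathPrefix, String service) {
        migratedPrefixes.put(pathPrefix, service);
    }

    /** Rollback switch: send the prefix back to the monolith. */
    void rollback(String pathPrefix) {
        migratedPrefixes.remove(pathPrefix);
    }

    /** Longest matching migrated prefix wins; unmatched paths stay on the monolith. */
    String route(String path) {
        String best = null;
        for (String prefix : migratedPrefixes.keySet()) {
            if (path.startsWith(prefix) && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? monolith : migratedPrefixes.get(best);
    }
}
```

Keeping the monolith as the default means a rollback is a single table change at the gateway, with no redeploy of either side.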

Migration Playbook

  • Phase 0: Identify bounded context seams and ownership
  • Phase 1: Route read-only endpoints first (lower blast radius)
  • Phase 2: Migrate writes with idempotency and outbox events
  • Phase 3: Decommission monolith modules after zero-traffic burn-in

Failure Modes

  • Shared DB coupling keeps hidden runtime dependencies
  • Cross-service transactions recreated as synchronous chains
  • Incomplete parity checks create data drift after cutover

Operational Guardrails

  • Keep rollback switch at gateway level
  • Track golden metrics per migrated route: error rate, latency, data mismatch
  • Run dual-read/compare for critical paths before full write cutover

Saga Pattern (Cross-Service Transactions)

See the dedicated Saga Pattern Guide for full coverage.


Event-Driven Microservices

Service A (producer) → Kafka → Service B (consumer)
                             → Service C (consumer)

Domain Events

// Register domain events on the aggregate (Spring Data's AbstractAggregateRoot)
@Entity
public class Order extends AbstractAggregateRoot<Order> {
    public Order complete() {
        this.status = COMPLETED;
        registerEvent(new OrderCompletedEvent(this)); // Collected by Spring Data
        return this;
    }
}

// Spring Data publishes the registered events in-process on save
orderRepository.save(order); // → OrderCompletedEvent fired

// Listen in another service via Kafka.
// Assumes the in-process event is relayed to the "order-completed" topic
// (for example by an event listener or a transactional outbox); that bridge is not shown.
@KafkaListener(topics = "order-completed")
public void onOrderCompleted(OrderCompletedEvent event) {
    inventoryService.releaseReservation(event.getOrderId());
}

Distributed Tracing

Track requests across services.

// Spring Boot + Micrometer + Zipkin/Jaeger
// Auto-propagates trace/span IDs via HTTP headers
// No manual code needed with Spring Cloud Sleuth (Spring Boot 2.x);
// Spring Boot 3 replaces Sleuth with Micrometer Tracing

// Logs automatically include traceId, spanId
// [traceId=abc123, spanId=def456] Processing order 789

# application.yml (Spring Cloud Sleuth)
spring:
  sleuth:
    sampler:
      probability: 1.0 # 100% sampling (reduce in prod)
  zipkin:
    base-url: http://zipkin:9411

Anti-Patterns to Avoid

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Distributed monolith | Services tightly coupled, deployed together | Define clear bounded contexts |
| Shared DB | Services share same schema | Each service owns its data |
| Chatty services | Many fine-grained calls per request | Aggregate API or BFF pattern |
| No API versioning | Breaking changes kill consumers | Version APIs from day 1 |
| Synchronous chain | A→B→C→D, one failure kills all | Break with async events |

Retry Pattern

// Resilience4j Retry
@Bean
public RetryConfig retryConfig() {
    return RetryConfig.custom()
        .maxAttempts(3)
        .waitDuration(Duration.ofMillis(500))
        .retryExceptions(ConnectTimeoutException.class, IOException.class)
        .ignoreExceptions(BadRequestException.class) // don't retry 4xx
        .build();
}

// Exponential backoff with jitter
RetryConfig.custom()
    .intervalFunction(IntervalFunction.ofExponentialRandomBackoff(
        Duration.ofMillis(200),   // initial
        2.0,                      // multiplier
        Duration.ofSeconds(10)))  // max
    .build();
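
To make the backoff arithmetic concrete, the sketch below computes the capped exponential delay for the same parameters (200 ms initial, multiplier 2.0, 10 s cap) and then applies "full jitter", a uniformly random delay between zero and that cap. `Backoff` is a hypothetical helper, and full jitter is one common variant; it is not necessarily the randomization Resilience4j applies.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper illustrating capped exponential backoff with full jitter. */
final class Backoff {

    /** Delay ceiling before the given retry attempt (1-based): initial * multiplier^(attempt-1), capped. */
    static long cappedExponentialMillis(int attempt, long initialMillis, double multiplier, long maxMillis) {
        double raw = initialMillis * Math.pow(multiplier, attempt - 1);
        return (long) Math.min(raw, (double) maxMillis);
    }

    /** Full jitter: a uniformly random delay in [0, capMillis]. */
    static long withFullJitter(long capMillis) {
        return ThreadLocalRandom.current().nextLong(capMillis + 1);
    }
}
```

With these parameters the ceilings run 200, 400, 800, 1600, 3200, 6400 ms and then stay at 10000 ms. The jitter spreads retries from many clients over time so they do not hit a recovering service in synchronized waves.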

Envoy Proxy

Envoy is the data plane proxy used by Istio, AWS App Mesh, and many others.

Envoy capabilities:
L3/L4: TCP proxy, TLS termination/origination
L7: HTTP/1.1, HTTP/2, gRPC, WebSocket
Observability: distributed tracing (Zipkin, Jaeger, X-Ray), stats
Service discovery: via xDS API from control plane
Load balancing: round-robin, least-request, ring hash, Maglev
Fault injection: inject delays and errors for testing

Kubernetes Networking Concepts

Pod networking:
  Every pod gets its own IP (flat network)
  Pods can reach each other directly across nodes
  CNI plugin handles this (Calico, Flannel, Cilium, Weave)

Service types:
  ClusterIP: internal-only VIP, reachable within cluster
  NodePort: exposes on every node's IP:port (30000-32767)
  LoadBalancer: provisions cloud LB (AWS ELB, GCP NLB)
  ExternalName: maps to external DNS name

Ingress:
  L7 HTTP routing → backend Services
  nginx Ingress, Traefik, AWS ALB Ingress Controller

## Interview Questions


### Q: What is an API Gateway? What responsibilities should it have?

**A:** An API gateway is the entry point that centralizes cross-cutting concerns for external traffic. It should handle routing, auth, rate limiting, request shaping, and observability, not domain business logic.

### Q: Explain the Circuit Breaker pattern. What are its states?

**A:** Circuit breaker stops calling unhealthy dependencies after failures exceed a threshold. States are closed (normal), open (fail fast), and half-open (probe recovery).

### Q: How does service discovery work in a microservices environment?

**A:** Instances register their endpoints in a registry or are discovered via platform DNS. Clients or sidecars resolve healthy endpoints dynamically to avoid hardcoded addresses.

### Q: What is a service mesh and what problems does it solve?

**A:** A service mesh moves retries, mTLS, traffic policy, and telemetry to sidecars/control plane. It standardizes service-to-service networking without rewriting each app.

### Q: How would you migrate a monolith to microservices?

**A:** Start with domain boundaries, extract one vertical slice, and route traffic gradually using strangler pattern. Keep contracts explicit and use observability + rollback at every cutover.

### Q: What is the Bulkhead pattern and how does it prevent cascading failures?

**A:** Bulkheads isolate resources like thread pools/queues per dependency. A failing downstream then consumes only its partition and cannot starve unrelated traffic.

### Q: How do you handle data consistency when each microservice has its own database?

**A:** Prefer local ACID per service and coordinate cross-service workflows with saga/outbox patterns. Accept eventual consistency and design idempotent handlers plus compensation.

### Q: What is the difference between orchestration and choreography in microservices?

**A:** Orchestration uses a central coordinator to direct steps; choreography relies on event reactions among services. Orchestration improves flow visibility, while choreography reduces central coupling.

### Q: How do you implement distributed tracing in a Spring Boot microservices system?

**A:** Propagate W3C trace context across HTTP/Kafka and instrument via OpenTelemetry auto/manual spans. Export to Jaeger/Tempo/Zipkin and correlate with logs/metrics.

### Q: What are the signs that you should NOT be using microservices?

**A:** Small team, unstable domain boundaries, and low scale usually favor a modular monolith. If ops overhead dominates feature delivery, microservices are premature.


### Microservices Networking & Mesh Questions


**Q1. What is a circuit breaker and why is it needed in microservices?**
> A circuit breaker prevents cascading failures: if Service A calls Service B and B is slow or down, A's threads fill up waiting for B's timeouts, and eventually A becomes unavailable too. The circuit breaker opens once failures cross a threshold and immediately fails calls with a fallback (rather than waiting for a timeout). After a recovery period, it lets test calls through; if they succeed, it closes and normal operation resumes.

**Q2. What is a service mesh and what problems does it solve?**
> A service mesh adds a sidecar proxy (Envoy) to every pod, intercepting all network traffic. It moves cross-cutting concerns out of application code: automatic mTLS between services (zero-trust), observability (distributed tracing, metrics without code changes), traffic management (retries, circuit breaking, timeouts), canary deployments, and authorization policies. The app just speaks plain HTTP; the sidecar handles the rest.

**Q3. What is the difference between client-side and server-side service discovery?**
> Client-side: the calling service queries a registry (Eureka, Consul) for the list of healthy instances and load-balances among them, so the client needs a registry client library. Server-side: the client sends to a stable address (a Kubernetes Service ClusterIP), and the infrastructure (kube-proxy, load balancer) routes to a healthy instance, so the client has no discovery logic. Kubernetes uses server-side discovery: DNS resolves to the ClusterIP, and kube-proxy routes to pods.

**Q4. What is the bulkhead pattern?**
> Named after ship compartments, bulkheads isolate resources per dependency. Each downstream service gets its own thread pool (or semaphore limit). If one service is slow and exhausts its thread pool, other services' thread pools are unaffected and the failure is contained. Without bulkheads, one slow service can consume all application threads, bringing down all other endpoints.

**Q5. How does Istio implement mTLS without changing application code?**
> Istio's control plane (Istiod) automatically provisions X.509 certificates for every service account. The Envoy sidecar intercepts all inbound/outbound traffic: it terminates incoming mTLS and initiates outgoing mTLS, transparently to the application. The app speaks plain HTTP to the sidecar on localhost. The sidecar upgrades to mTLS for inter-service calls. Certificate rotation is also automatic.

**Q6. What is a Kubernetes ClusterIP service and how does kube-proxy route traffic to it?**
> A ClusterIP is a virtual IP (VIP) that doesn't correspond to any actual network interface. kube-proxy watches the Kubernetes API and programs iptables (or IPVS) rules on every node: packets destined for the ClusterIP:port are DNAT'd to a randomly selected healthy pod IP:port. This happens in the Linux kernel before the packet reaches any application, with no extra network hops.

**Q7. What is a canary deployment in the context of a service mesh?**
> A canary deployment gradually routes a small percentage of traffic to a new version of a service while the rest continues to the stable version. Istio VirtualService weight routing allows this: `v1: 95%, v2: 5%`. Monitor v2's error rate and latency. If healthy, increase to 20%, 50%, 100%. If problems appear, instantly route 100% back to v1. The service mesh makes this seamless: no DNS changes, no infrastructure changes, just a YAML update.

**Q8. What is the difference between retry and circuit breaker patterns?**
> Retries handle **transient failures**: try again immediately or with backoff, hoping the next attempt succeeds (network blip, momentary unavailability). Circuit breakers handle **sustained failures**: stop trying when a service is clearly down, failing fast and returning a fallback immediately instead. They work together: retry handles flickers; the circuit breaker trips when retries consistently fail, preventing retry storms from overwhelming a struggling service.

---

## Deployment Strategies: Zero-Downtime Releases

:::info[Chapter 8 Reference]
Building Microservices dedicates substantial coverage to deployment strategies, noting that independent deployability is one of the key benefits of microservices, but that it requires sophisticated deployment infrastructure.
:::

### Blue-Green Deployment

Blue  (current production) ←── 100% traffic
Green (new version)        ←── 0% traffic (being tested)

Switch:
Blue  ←── 0% traffic
Green ←── 100% traffic

Rollback: instant (just flip back)
Requires: 2x infrastructure


```yaml
# Kubernetes blue-green via service selector swap
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  selector:
    app: order-service
    version: blue # ← change to "green" to switch
  ports:
    - port: 80
      targetPort: 8080
---
# Blue deployment (current)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-blue
spec:
  replicas: 5
  selector:
    matchLabels:
      app: order-service
      version: blue
  # Pod template omitted; same shape as green, labelled version: blue
---
# Green deployment (new version, deployed before cutover)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-green
spec:
  replicas: 5
  selector:
    matchLabels:
      app: order-service
      version: green
  template:
    metadata:
      labels: # must match the selector above
        app: order-service
        version: green
    spec:
      containers:
        - name: order-service
          image: order-service:v2.0.0 # New version
```

Canary Deployment

v1 stable ←── 95% traffic
v2 canary ←── 5% traffic (monitor error rate, latency)

If v2 healthy:
  v1 ←── 80%, v2 ←── 20%
  v1 ←── 0%, v2 ←── 100%

If v2 degraded:
  v1 ←── 100% (instant rollback via weight change)

# Istio VirtualService: canary with header-based routing
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    # Internal QA team sees new version 100%
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: order-service
            subset: v2
          weight: 100
    # Everyone else: 95/5 split
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 95
        - destination:
            host: order-service
            subset: v2
          weight: 5
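
How a weight split maps an individual request to a version can be sketched as a walk over cumulative weights. `WeightedRouter` is a hypothetical illustration of the idea, not how Envoy implements it; the caller supplies the random roll so the mapping is deterministic and easy to check.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical weighted router: maps a roll in [0, totalWeight) to a subset. */
final class WeightedRouter {
    private final Map<String, Integer> weights = new LinkedHashMap<>();

    /** Sets (or updates) the weight of a subset; weight 0 removes it from rotation. */
    WeightedRouter weight(String subset, int weight) {
        if (weight < 0) {
            throw new IllegalArgumentException("weight must be >= 0");
        }
        weights.put(subset, weight);
        return this;
    }

    int totalWeight() {
        int total = 0;
        for (int w : weights.values()) {
            total += w;
        }
        return total;
    }

    /** Walks cumulative weights: with v1=95 and v2=5, rolls 0..94 pick v1 and 95..99 pick v2. */
    String pick(int roll) {
        if (roll < 0 || roll >= totalWeight()) {
            throw new IllegalArgumentException("roll out of range: " + roll);
        }
        int cumulative = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            cumulative += e.getValue();
            if (roll < cumulative) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("unreachable: roll was range-checked");
    }
}
```

In use, the roll would come from something like `ThreadLocalRandom.current().nextInt(router.totalWeight())`; setting the canary weight to 0 is the rollback.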

Shadow (Dark Launch) Deployment

Production traffic → v1 (responses served to users)
                   ↓ (mirrored copy)
                   → v2 (response DISCARDED, never sent to users)

Purpose: Compare v1 vs v2 behavior under real production traffic
         without risking user experience

# Istio traffic mirroring
http:
  - route:
      - destination:
          host: order-service
          subset: v1
        weight: 100
    mirror:
      host: order-service
      subset: v2
    mirrorPercentage:
      value: 100.0 # Mirror 100% of requests to v2

Deployment Strategy Comparison

| Strategy | Rollback Speed | Risk | Infrastructure Cost | Use Case |
| --- | --- | --- | --- | --- |
| Rolling | Slow (re-roll forward) | Medium | 1x | Standard deployments |
| Blue-Green | Instant (flip selector) | Low | 2x | Critical services, DB migrations |
| Canary | Fast (reduce weight to 0) | Very Low | ~1.1x | User-facing feature flags |
| Shadow | N/A (no user impact) | None | ~2x | Algorithm comparison, load testing |
| Feature Flag | Instant (toggle flag) | Very Low | 1x | Behavioral changes, gradual rollout |

Service Decomposition: Decision Framework

:::info[Chapter 2 Reference]
Building Microservices Chapter 2 provides the foundational guidance for finding service boundaries using Domain-Driven Design bounded contexts and the concepts of loose coupling and high cohesion.
:::

Decomposition Criteria

High Cohesion: Things that change together stay together
Loose Coupling: Services need to know as little as possible about each other

Right-size a service by asking:
1. Can this service be deployed without coordinating with other teams?
2. Does it have a single, well-defined bounded context?
3. Can a small team (2-3 engineers) own it independently?
4. Does it have data that only it should write to?

Decomposition Patterns

| Pattern | Split By | Best For | Risk |
| --- | --- | --- | --- |
| By Business Capability | What the business does (Orders, Payments, Inventory) | Aligns with org structure | Requires understanding the domain |
| By Subdomain (DDD) | Core, Supporting, Generic subdomains | Strategic alignment | Needs domain expertise |
| By Volatility | How often things change (stable vs. frequently changing) | Reduces deployment friction | Can create odd boundaries |
| By Scale | Which parts need to scale independently | Performance optimization | May create tight coupling |
| By Team | Conway's Law: mirror the team structure | Reduces coordination overhead | Risk of wrong boundaries if teams change |

The Distributed Monolith: Warning Signs

❌ Signs you've built a distributed monolith:

1. Services must be deployed in a specific order
2. One service failing takes down others synchronously
3. Services share a database schema (integration database)
4. API changes require coordinating releases across 5+ services
5. Testing requires running the full system
6. No service can be scaled independently
7. Services call each other in synchronous chains (A→B→C→D→E)

✅ Signs of healthy microservices:

1. Any service can be deployed alone, any day
2. Services have well-defined, versioned APIs
3. Each service owns its own data store
4. Services are independently testable in isolation
5. A service failure degrades gracefully (circuit breaker + fallback)
6. Teams deploy 10+ times per day independently

Database-per-Service: Data Management Patterns

The Integration Database Problem

Integration Database (anti-pattern):
  Order Service   ─┐
  Payment Service ─┼──→ Single shared DB schema
  User Service    ─┘

Problems:
  - Schema changes require coordinating all teams
  - No service can optimize its DB technology for its workload
  - Any service can read/write any other service's tables
  - Tight runtime coupling: one service's query can starve others

Database-per-Service (correct pattern):
  Order Service   → Orders DB (Postgres)
  Payment Service → Payments DB (Postgres)
  User Service    → Users DB (Postgres + Redis cache)
  Search Service  → Elasticsearch

Cross-Service Data Access Patterns

| Pattern | Mechanism | Consistency | Complexity | Use Case |
| --- | --- | --- | --- | --- |
| API Composition | Aggregate data via API calls at query time | Strong | Low | Simple joins across 2-3 services |
| CQRS + Event Sourcing | Read model built from events | Eventual | High | Complex queries across many services |
| Saga Pattern | Choreography/Orchestration for writes | Eventual | Medium-High | Cross-service transactions |
| Shared Read Replica | Services export data to a common analytics DB | Eventual | Medium | Reporting, analytics |
| GraphQL Federation | Each service owns its GraphQL schema, gateway federates | Strong | Medium | BFF/aggregation for frontends |

Handling Cross-Service Queries

// Pattern: API Composition, aggregating in the gateway/BFF layer
// orderClient, paymentClient and shipmentClient are assumed to be injected HTTP clients.
@Service
public class OrderSummaryService {

    public OrderSummaryResponse getSummary(Long orderId) {
        // Parallel fetch from multiple services
        CompletableFuture<Order> orderFuture =
            CompletableFuture.supplyAsync(() -> orderClient.getOrder(orderId));

        CompletableFuture<PaymentStatus> paymentFuture =
            CompletableFuture.supplyAsync(() -> paymentClient.getStatus(orderId));

        CompletableFuture<ShipmentInfo> shipmentFuture =
            CompletableFuture.supplyAsync(() -> shipmentClient.getInfo(orderId));

        // Wait for all with a timeout (orTimeout needs Java 9+). join() throws an
        // unchecked CompletionException if any call fails or the deadline passes,
        // so no checked exceptions leak out of this method.
        CompletableFuture.allOf(orderFuture, paymentFuture, shipmentFuture)
            .orTimeout(2, TimeUnit.SECONDS)
            .join();

        return OrderSummaryResponse.builder()
            .order(orderFuture.join())
            .payment(paymentFuture.join())
            .shipment(shipmentFuture.join())
            .build();
    }
}

Contract Testing: Preventing Breaking Changes

Consumer-Driven Contract Testing

Problem: Service A depends on Service B's API.
How do you verify Service B's changes don't break Service A
without requiring both to be deployed together?

Solution: Consumer-Driven Contracts (Pact)
1. Consumer (A) writes a contract: "I expect B to respond with X to request Y"
2. Contract is published to Pact Broker
3. Provider (B) verifies it can fulfill all consumer contracts
before merging any API change
4. CI/CD blocks: B's PR fails if any consumer contract breaks

// Consumer (Order Service): define contract
@ExtendWith(PactConsumerTestExt.class)
@PactTestFor(providerName = "payment-service")
class OrderServiceContractTest {

    @Pact(consumer = "order-service")
    public RequestResponsePact paymentChargePact(PactDslWithProvider builder) {
        return builder
            .given("Payment service is available")
            .uponReceiving("a charge request")
                .path("/api/payments/charge")
                .method("POST")
                .body(new PactDslJsonBody()
                    .stringType("orderId")
                    .numberType("amount"))
            .willRespondWith()
                .status(200)
                .body(new PactDslJsonBody()
                    .stringType("transactionId")
                    .stringValue("status", "SUCCESS"))
            .toPact();
    }

    @Test
    @PactTestFor(pactMethod = "paymentChargePact")
    void shouldChargePayment(MockServer mockServer) {
        // Point the real client at Pact's mock provider
        PaymentClient paymentClient = new PaymentClient(mockServer.getUrl());
        PaymentResult result = paymentClient.charge(
            new ChargeRequest("order-123", 99.99));
        assertThat(result.getStatus()).isEqualTo("SUCCESS");
    }
}

API Versioning Strategy

Versioning approaches:

1. URI versioning: /api/v1/orders, /api/v2/orders
Pros: Explicit, easy to route
Cons: URL fragmentation, clients must update URLs

2. Header versioning: Accept: application/vnd.myapp.v2+json
Pros: Clean URLs
Cons: Hidden, hard to test in browser

3. Semantic versioning + backwards compatibility:
Never break: existing response fields
Safe to add: new optional fields
Breaking: removing/renaming fields, changing types

4. Hypermedia (HATEOAS):
Responses include links to next actions
Clients navigate by following links, not hardcoded URLs

// Additive change (safe: does not break consumers):
// Old: { "orderId": "123", "status": "COMPLETED" }
// New: { "orderId": "123", "status": "COMPLETED", "completedAt": "2024-01-01" }

// Breaking change (requires a new version):
// Old: { "price": 99.99 } (number)
// New: { "price": { "amount": 99.99, "currency": "USD" } } (object)

// Tolerate unknown fields in consumers
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;

@JsonIgnoreProperties(ignoreUnknown = true) // configure this on every consumer DTO
public class PaymentResponse {
    private String transactionId;
    private String status;
    // new fields from the provider are safely ignored
}
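
To make the first two versioning approaches concrete, here is a minimal sketch of how a gateway or controller might resolve the requested version. The class name `ApiVersionResolver`, the precedence rule (URI first, then `Accept` header, then a default), and the `vnd.myapp` media type prefix taken from the example above are assumptions for illustration, not a prescribed implementation.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: resolves the API version from the URI path or the Accept header. */
final class ApiVersionResolver {
    // Matches paths such as "/api/v2/orders" and captures "2".
    private static final Pattern URI_VERSION = Pattern.compile("^/api/v(\\d{1,3})/");
    // Matches media types such as "application/vnd.myapp.v2+json" and captures "2".
    private static final Pattern HEADER_VERSION =
            Pattern.compile("application/vnd\\.myapp\\.v(\\d{1,3})\\+json");

    private final int defaultVersion;

    ApiVersionResolver(int defaultVersion) {
        this.defaultVersion = defaultVersion;
    }

    /** Assumed precedence: URI version wins, then the Accept header, then the default. */
    int resolve(String path, String acceptHeader) {
        Optional<Integer> fromUri = firstGroup(URI_VERSION, path);
        if (fromUri.isPresent()) {
            return fromUri.get();
        }
        return firstGroup(HEADER_VERSION, acceptHeader).orElse(defaultVersion);
    }

    private static Optional<Integer> firstGroup(Pattern pattern, String input) {
        if (input == null) {
            return Optional.empty();
        }
        Matcher matcher = pattern.matcher(input);
        return matcher.find()
                ? Optional.of(Integer.parseInt(matcher.group(1)))
                : Optional.empty();
    }
}
```

Falling back to a default version keeps unversioned clients working, but it also hides the version they depend on, so some teams prefer to reject unversioned requests instead.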

Rate Limiting Patterns: Deep Dive

Chapter 12 Reference

Building Microservices Chapter 12 on resilience covers rate limiting as a key stability pattern protecting services from both internal and external overload.

Algorithm Comparison

| Algorithm | Memory | Burst Handling | Fairness | Use Case |
|---|---|---|---|---|
| Fixed Window | O(1) | Allows 2x burst at window edge | Poor | Simple, low accuracy needed |
| Sliding Window Log | O(requests) | Precise, no burst | Good | Low traffic, high precision |
| Sliding Window Counter | O(1) | Smooth (weighted interpolation) | Good | Production APIs |
| Token Bucket | O(1) | Configurable burst capacity | Good | Network bandwidth limiting |
| Leaky Bucket | O(queue size) | Smooths bursts into constant rate | Fair | Backend protection |

Token Bucket:
  Tokens refill at steady rate (e.g., 100/sec)
  Each request consumes 1 token
  Burst allowed up to bucket capacity (e.g., 500 tokens)
  Empty bucket → 429 Too Many Requests

Leaky Bucket:
  Requests enter queue regardless of rate
  Queue drains at fixed rate (e.g., 100/sec)
  Queue full → 429 Too Many Requests
  Output is always smooth, which protects the backend
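
The token bucket rules above can be sketched as a small in-memory class. This is a single-process illustration only: the class name `TokenBucket`, the lazy refill on each call, and the injectable nanosecond clock (used so the behavior can be tested without sleeping) are assumptions, not part of any particular library.

```java
import java.util.function.LongSupplier;

/** Hypothetical single-process token bucket; refills lazily whenever tryAcquire is called. */
final class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Returns true if a token was consumed; false means the caller should respond with 429. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        long elapsedNanos = now - lastRefillNanos;
        // Add the tokens earned since the last call, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + elapsedNanos * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code the clock would typically be `System::nanoTime`, for example `new TokenBucket(500, 100.0, System::nanoTime)` for the 100/sec rate and 500-token burst used in the description above.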

Distributed Rate Limiting with Redis

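// Sliding window rate limiter using a Redis sorted set
import java.time.Duration;
import java.util.List;
import java.util.UUID;

import org.springframework.data.redis.core.RedisOperations;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.SessionCallback;
import org.springframework.stereotype.Service;

@Service
public class RateLimiter {
    private final RedisTemplate<String, String> redis;

    public RateLimiter(RedisTemplate<String, String> redis) {
        this.redis = redis;
    }

    public boolean isAllowed(String userId, int limitPerMinute) {
        String key = "rate:" + userId;
        long now = System.currentTimeMillis();
        long windowStart = now - 60_000; // 1-minute window
        // Unique member so two requests in the same millisecond are both counted
        String member = now + ":" + UUID.randomUUID();

        // MULTI/EXEC is atomic: remove old entries, add current, count, refresh TTL
        List<Object> results = redis.execute(new SessionCallback<List<Object>>() {
            @Override
            @SuppressWarnings({"unchecked", "rawtypes"})
            public List<Object> execute(RedisOperations ops) {
                ops.multi();
                ops.opsForZSet().removeRangeByScore(key, 0, windowStart);
                ops.opsForZSet().add(key, member, now);
                ops.opsForZSet().size(key);
                ops.expire(key, Duration.ofMinutes(2));
                return ops.exec();
            }
        });

        // results: [removed count, add result, set size, expire result]
        // Rejected requests are still recorded, so they count against the window.
        Long count = (Long) results.get(2);
        return count <= limitPerMinute;
    }
}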
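
The Redis example stores one sorted-set entry per request, which corresponds to the Sliding Window Log row in the comparison table. For contrast, here is a minimal in-memory sketch of the Sliding Window Counter with weighted interpolation, which keeps only two counters per key. The class name `SlidingWindowCounter`, the injectable millisecond clock, and the choice not to count rejected requests are assumptions made for illustration.

```java
import java.util.function.LongSupplier;

/** Hypothetical sliding window counter: O(1) memory, weighted by how far into the window we are. */
final class SlidingWindowCounter {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clockMillis;
    private long currentWindowStart;
    private int currentCount;
    private int previousCount;

    SlidingWindowCounter(int limit, long windowMillis, LongSupplier clockMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
        long now = clockMillis.getAsLong();
        this.currentWindowStart = now - (now % windowMillis);
    }

    synchronized boolean isAllowed() {
        long now = clockMillis.getAsLong();
        long windowStart = now - (now % windowMillis);
        if (windowStart != currentWindowStart) {
            // The old count only matters if it belongs to the immediately preceding window.
            boolean adjacent = windowStart - currentWindowStart == windowMillis;
            previousCount = adjacent ? currentCount : 0;
            currentCount = 0;
            currentWindowStart = windowStart;
        }
        // Fraction of the current window that has elapsed, between 0.0 and 1.0.
        double elapsedFraction = (double) (now - windowStart) / windowMillis;
        // Weight the previous window by the part of it still inside the sliding window.
        double estimated = previousCount * (1.0 - elapsedFraction) + currentCount;
        if (estimated + 1 > limit) {
            return false;
        }
        currentCount++;
        return true;
    }
}
```

The estimate assumes requests in the previous window were evenly spread, which is the trade-off that buys O(1) memory compared with the per-request log.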

Interview Questions: Senior Level

Q: How do you decide if something should be one service or two?

A: Apply two tests from Building Microservices: loose coupling and high cohesion. If two capabilities share the same deployment cycle, same data, and the same team owns them, they belong together. If they have different scaling requirements, different change rates, or different team ownership, split them. The practical test: can each be deployed independently without coordination? If not, they are too coupled. A key smell for premature splitting is a synchronous chain: if Service A always calls Service B, they may be better as one service.

Q: Explain canary deployments and how you automate them safely.

A: A canary routes a small percentage of traffic (for example, 5%) to the new version while monitoring its error rate and latency against the baseline. Automation requires: (1) routing infrastructure (for example, Istio VirtualService weights), (2) SLO-aligned success criteria, (3) automated promotion (increase weight) and rollback (reduce weight to 0) triggered by metric thresholds. Argo Rollouts, a progressive delivery controller for Kubernetes, implements this with a canary strategy that steps from 5% to 20% to 50% to 100% if each stage passes its analysis metrics, and rolls back if any stage fails them.
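
The promotion and rollback logic described above can be sketched as a small state machine. This is an illustrative model only, not how Argo Rollouts or Istio are implemented: the class `CanaryRollout`, the `Metrics` shape, and the thresholds passed to the constructor are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical canary state machine: advance one traffic step per healthy analysis. */
final class CanaryRollout {
    /** Metrics observed for the canary during one analysis interval (assumed shape). */
    static final class Metrics {
        final double errorRate;     // fraction of failed requests, 0.0 to 1.0
        final double p99LatencyMs;  // 99th percentile latency in milliseconds

        Metrics(double errorRate, double p99LatencyMs) {
            this.errorRate = errorRate;
            this.p99LatencyMs = p99LatencyMs;
        }
    }

    enum State { IN_PROGRESS, PROMOTED, ROLLED_BACK }

    private final List<Integer> steps;    // traffic weights in percent, e.g. 5, 20, 50, 100
    private final double maxErrorRate;
    private final double maxP99LatencyMs;
    private int stepIndex = 0;
    private State state = State.IN_PROGRESS;

    CanaryRollout(List<Integer> steps, double maxErrorRate, double maxP99LatencyMs) {
        if (steps.isEmpty()) {
            throw new IllegalArgumentException("at least one step is required");
        }
        this.steps = new ArrayList<>(steps);
        this.maxErrorRate = maxErrorRate;
        this.maxP99LatencyMs = maxP99LatencyMs;
    }

    /** Percentage of traffic the canary should receive right now. */
    int currentWeight() {
        return state == State.ROLLED_BACK ? 0 : steps.get(stepIndex);
    }

    State state() {
        return state;
    }

    /** Applies one analysis result: promote to the next step, or roll back to 0%. */
    State evaluate(Metrics metrics) {
        if (state != State.IN_PROGRESS) {
            return state; // terminal states ignore further analysis
        }
        boolean healthy = metrics.errorRate <= maxErrorRate
                && metrics.p99LatencyMs <= maxP99LatencyMs;
        if (!healthy) {
            state = State.ROLLED_BACK;
            return state;
        }
        if (stepIndex < steps.size() - 1) {
            stepIndex++;
        }
        if (stepIndex == steps.size() - 1) {
            state = State.PROMOTED;
        }
        return state;
    }
}
```

A real controller would also compare the canary against the baseline version and wait for a minimum sample size before each decision; the sketch uses fixed thresholds to keep the control flow visible.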

Q: What is consumer-driven contract testing? How does it differ from end-to-end testing?

A: Consumer-driven contract tests verify that a service honors the expectations its consumers have documented. The consumer writes the contract (expected request/response), and the provider's CI verifies it can fulfill that contract in isolation. This is far cheaper than end-to-end tests that require all services to be running, and catches breaking API changes before they reach production. The key benefit is that providers get immediate feedback if their change breaks a consumer, without needing a shared integration environment.

Q: How do you handle the need for cross-service transactions in a microservices system?

A: Avoid blocking distributed transaction protocols (like 2PC). Instead, implement eventual consistency patterns like the Saga Pattern (using Orchestration or Choreography to coordinate local ACID transactions and semantic compensations) and the Transactional Outbox Pattern (to solve the dual-write problem). For complete architectural schemas and implementation code, see the Saga Pattern Guide and Transactional Outbox Pattern Guide.
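
As a rough illustration of the orchestration variant mentioned above, the following sketch runs a list of local steps in order and, on failure, runs the compensations of the already completed steps in reverse. The `SagaOrchestrator` class and its fluent `step` method are hypothetical names; a production saga would also persist its progress and handle failing compensations, which this sketch leaves out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical in-memory saga orchestrator: local actions plus semantic compensations. */
final class SagaOrchestrator {
    private static final class Step {
        final String name;
        final Runnable action;
        final Runnable compensation;

        Step(String name, Runnable action, Runnable compensation) {
            this.name = name;
            this.action = action;
            this.compensation = compensation;
        }
    }

    private final List<Step> steps = new ArrayList<>();

    /** Registers a local transaction and the action that semantically undoes it. */
    SagaOrchestrator step(String name, Runnable action, Runnable compensation) {
        steps.add(new Step(name, action, compensation));
        return this;
    }

    /** Returns true if every step succeeded; false if the saga was compensated. */
    boolean execute() {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step);
            } catch (RuntimeException failure) {
                // Undo completed steps, most recent first. The failed step itself
                // is assumed to have left no partial effects of its own.
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

For example, a saga with steps "reserve inventory", "charge payment", and "ship order" whose payment step throws would release the inventory reservation and never attempt shipping.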

Q: What are the operational challenges of microservices that monoliths avoid?

A: (1) Distributed tracing: request debugging requires correlating logs across services. (2) Data consistency: no cross-service transactions; eventual consistency adds complexity. (3) Network overhead: every call is a network hop with latency, failure probability, and serialization cost. (4) Testing: integration tests require running many services. (5) Operational overhead: N services means N CI pipelines, N dashboards, N alert pages. (6) Service discovery: dynamic instance registration required. (7) API versioning: cannot do breaking changes freely. Building Microservices emphasizes that these costs are real and must be justified by the scale and team-size benefits microservices provide.