Redis Clustering, Replication & High Availability
Replication
Redis uses asynchronous master-replica replication. All writes go to the master; replicas receive a copy asynchronously.
👶 Beginner Concept: The "Franchise Recipe" Analogy
Imagine you own a famous pizza franchise.
- The Master (Headquarters): This is the only place allowed to write new recipes (WRITE operations). When the Master creates a new pizza, it instantly sends a fax (asynchronous background stream) with the new recipe to all its franchise stores.
- The Replicas (Franchise Stores): These stores are strictly read-only. Customers can walk into any branch to order (READ operations), and they will get the same pizza, although a branch may briefly miss a recipe whose fax has not arrived yet. If a branch burns down, the Master doesn't care; when a replacement branch opens, the Master faxes ALL the recipes to it from scratch (Full Sync). If the Master burns down, one of the branches must be officially promoted to Headquarters to accept new recipes.
Master (read + write)
├── Replica 1 (read-only)  ← async replication stream
├── Replica 2 (read-only)
└── Replica 3 (read-only)
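How Replication Works
1. Replica connects to the master and requests synchronization (PSYNC)
2. Master starts a full resync (FULLRESYNC) and sends an RDB snapshot
3. While the snapshot is being created and transferred, the master buffers new write commands
4. Replica loads the RDB, then applies the buffered commands
5. Ongoing: master streams every write command to the replica; the replication backlog lets a briefly disconnected replica catch up with a partial resync instead of a full one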
# redis.conf on replica
replicaof master-host 6379
replica-read-only yes
# Monitor replication lag
INFO replication
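# On the master: slave0:ip=...,state=online,offset=<bytes>,lag=<seconds since last ACK>
# Compare master_repl_offset (master) with slave_repl_offset (replica) to measure lag in bytes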
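As a rough illustration of how the offsets relate, the sketch below parses a hand-written sample of master-side INFO replication output and computes how many bytes each replica is behind. The sample text, the class name ReplicationLag, and the method bytesBehind are assumptions made up for this example; real output contains many more fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ReplicationLag {
    /** Returns bytes-behind per replica ("slave0", "slave1", ...) from master-side INFO replication text. */
    static Map<String, Long> bytesBehind(String info) {
        long masterOffset = -1;
        Map<String, Long> replicaOffsets = new LinkedHashMap<>();
        for (String raw : info.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.startsWith("master_repl_offset:")) {
                masterOffset = Long.parseLong(line.substring("master_repl_offset:".length()));
            } else if (line.matches("slave\\d+:.*")) {
                int colon = line.indexOf(':');
                String name = line.substring(0, colon);
                // Fields look like ip=...,port=...,state=...,offset=...,lag=...
                for (String field : line.substring(colon + 1).split(",")) {
                    if (field.startsWith("offset=")) {
                        replicaOffsets.put(name, Long.parseLong(field.substring("offset=".length())));
                    }
                }
            }
        }
        if (masterOffset < 0) {
            throw new IllegalArgumentException("master_repl_offset not found");
        }
        Map<String, Long> lag = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : replicaOffsets.entrySet()) {
            lag.put(e.getKey(), masterOffset - e.getValue());
        }
        return lag;
    }

    public static void main(String[] args) {
        // Hand-written sample, not captured from a real server
        String sample = String.join("\n",
                "role:master",
                "connected_slaves:2",
                "slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0",
                "slave1:ip=10.0.0.3,port=6379,state=online,offset=940,lag=1",
                "master_repl_offset:1000");
        System.out.println(bytesBehind(sample)); // {slave0=0, slave1=60}
    }
}
```

A byte gap that keeps growing is usually a more useful alert signal than a single snapshot, since a small nonzero gap is normal under write load.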
Replication Lag and Consistency
Redis replication is asynchronous by default, so replicas may be behind the master. If a failover happens while the promoted replica is lagging, the writes it has not yet received are lost.
# Not synchronous replication: the master does not wait for replicas on each write.
# It only refuses writes when too few replicas are connected and recently in contact (best effort).
# Require at least 1 replica that satisfies min-replicas-max-lag
min-replicas-to-write 1
# A replica counts only if its last acknowledgment is at most 10 seconds old
min-replicas-max-lag 10
# If the condition is not met → master refuses writes (bounds the data-loss window)
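A minimal sketch of the check these two settings imply, assuming the rule is "count the replicas whose last acknowledgment is no older than min-replicas-max-lag seconds, and reject writes when that count is below min-replicas-to-write". The class WriteGate and its method are hypothetical names; this models the rule and is not Redis source code.

```java
class WriteGate {
    /**
     * @param replicaLagSeconds  seconds since each connected replica last acknowledged
     * @param minReplicasToWrite value of min-replicas-to-write (0 disables the check)
     * @param minReplicasMaxLag  value of min-replicas-max-lag (0 disables the check)
     */
    static boolean acceptsWrites(int[] replicaLagSeconds, int minReplicasToWrite, int minReplicasMaxLag) {
        if (minReplicasToWrite == 0 || minReplicasMaxLag == 0) {
            return true; // feature disabled
        }
        int goodReplicas = 0;
        for (int lag : replicaLagSeconds) {
            if (lag <= minReplicasMaxLag) {
                goodReplicas++;
            }
        }
        return goodReplicas >= minReplicasToWrite;
    }

    public static void main(String[] args) {
        System.out.println(acceptsWrites(new int[] {1, 25}, 1, 10));  // true: one replica is fresh enough
        System.out.println(acceptsWrites(new int[] {12, 25}, 1, 10)); // false: every replica is too stale
        System.out.println(acceptsWrites(new int[] {}, 1, 10));       // false: no replicas connected
    }
}
```

Note that the check looks at recent acknowledgments in general, not at whether a particular write reached a replica, which is why this remains best effort.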
Read replicas for read scaling:
// Lettuce (Spring) read from replicas for read-heavy workloads
LettuceClientConfiguration config = LettuceClientConfiguration.builder()
        .readFrom(ReadFrom.REPLICA_PREFERRED) // Prefer replica, fall back to master
        .build();
Redis Sentinel: High Availability
Sentinel provides automatic failover for Redis without sharding. A deployment consists of at least three Sentinel processes, ideally an odd number so that a majority can always be formed.
┌──────────────────────────────┐
│ Sentinel Cluster (quorum)    │
│ Sentinel 1                   │
│ Sentinel 2 ← majority vote   │
│ Sentinel 3                   │
└──────────────────────────────┘
               │ monitors
               ▼
          [ Master ]
           /      \
  [Replica 1]    [Replica 2]
Failover Process
1. A Sentinel detects that the master is unreachable (subjectively down, SDOWN)
2. If at least quorum Sentinels agree → objectively down (ODOWN)
3. Sentinels elect a leader Sentinel (this needs a majority of all Sentinels)
4. Leader promotes the best replica to master (ranked by replica-priority, then replication offset, then run ID)
5. Other replicas are reconfigured to follow the new master
6. Old master (if it recovers) is reconfigured as a replica
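The promotion step can be sketched as a ranking over candidate replicas. The ordering below follows the Sentinel documentation as commonly described: a replica with replica-priority 0 is never promoted, then lower priority wins, then the larger replication offset, then the lexicographically smaller run ID. The Replica class and pick method are hypothetical, and the real selection also filters out replicas that have been disconnected for too long, which this sketch omits.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class ReplicaSelection {
    /** Hypothetical view of what a Sentinel knows about one replica. */
    static final class Replica {
        final String runId;
        final int priority;    // replica-priority; 0 means "never promote"
        final long replOffset; // how much of the master's stream it has processed

        Replica(String runId, int priority, long replOffset) {
            this.runId = runId;
            this.priority = priority;
            this.replOffset = replOffset;
        }
    }

    /** Lower priority wins, then higher offset, then lexicographically smaller run ID. */
    static Optional<Replica> pick(List<Replica> candidates) {
        Comparator<Replica> order = Comparator
                .comparingInt((Replica r) -> r.priority)
                .thenComparing(Comparator.comparingLong((Replica r) -> r.replOffset).reversed())
                .thenComparing((Replica r) -> r.runId);
        return candidates.stream()
                .filter(r -> r.priority != 0)
                .min(order);
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
                new Replica("run-a", 100, 500),
                new Replica("run-b", 100, 900),
                new Replica("run-c", 0, 1000)); // excluded: priority 0
        System.out.println(pick(replicas).map(r -> r.runId).orElse("none")); // run-b
    }
}
```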
# sentinel.conf
# Quorum = 2 Sentinels must agree that the master is down
sentinel monitor mymaster 127.0.0.1 6379 2
# Unreachable for 5s = SDOWN
sentinel down-after-milliseconds mymaster 5000
# Failover timeout of 10s
sentinel failover-timeout mymaster 10000
# 1 replica resyncs with the new master at a time during failover
sentinel parallel-syncs mymaster 1
Failover time: Typically 15–30 seconds (detection + election + promotion). During this window writes fail, because the old master is down and the new one has not been promoted yet.
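The difference between the quorum in sentinel monitor and the vote needed to actually run a failover is easy to mix up. A small sketch, assuming the rule from the Sentinel documentation: quorum only decides when the master is flagged ODOWN, while the leader Sentinel must additionally be authorized by a majority of all Sentinels (or by quorum, if quorum is larger than the majority). Class and method names here are hypothetical.

```java
class SentinelVotes {
    /** ODOWN is reached once at least `quorum` Sentinels report the master as down. */
    static boolean isObjectivelyDown(int sentinelsReportingDown, int quorum) {
        return sentinelsReportingDown >= quorum;
    }

    /** A leader may start the failover only with votes from a majority (and at least quorum). */
    static boolean canStartFailover(int votesForLeader, int totalSentinels, int quorum) {
        int majority = totalSentinels / 2 + 1;
        return votesForLeader >= Math.max(majority, quorum);
    }

    public static void main(String[] args) {
        // 5 Sentinels, quorum 2, but only 2 of them can still talk to each other
        System.out.println(isObjectivelyDown(2, 2));   // true: failure is detected
        System.out.println(canStartFailover(2, 5, 2)); // false: 3 votes are needed
        // 3 Sentinels, quorum 2, two reachable
        System.out.println(canStartFailover(2, 3, 2)); // true
    }
}
```

This is why a minority partition of Sentinels can notice a failure but cannot promote a replica on its own.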
// Spring Boot Sentinel configuration
@Bean
public RedisConnectionFactory redisConnectionFactory() {
    // "mymaster" must match the master name used in sentinel.conf
    RedisSentinelConfiguration sentinelConfig = new RedisSentinelConfiguration()
            .master("mymaster")
            .sentinel("sentinel1", 26379)
            .sentinel("sentinel2", 26379)
            .sentinel("sentinel3", 26379);
    return new LettuceConnectionFactory(sentinelConfig);
}
🧠 Senior Deep Dive: Redis Cluster & Hash Slots
When your dataset exceeds the RAM of a single physical server (e.g., 500GB of cache), Sentinel does not help, because every node in a Sentinel deployment still holds 100% of the data. You need horizontal sharding.
Redis Cluster shards data across multiple master nodes using exactly 16,384 Hash Slots.
16,384 slots distributed across 3 masters:
┌───────────────────┬───────────────────┬───────────────────┐
│ Master A          │ Master B          │ Master C          │
│ Slots 0–5460      │ Slots 5461–10922  │ Slots 10923–16383 │
│ └── Replica A     │ └── Replica B     │ └── Replica C     │
└───────────────────┴───────────────────┴───────────────────┘
Why Exactly 16,384 Slots?
Nodes in a Redis Cluster continuously exchange heartbeat (ping/pong) packets over the cluster bus, gossiping about each other's state. Every heartbeat includes a bitmap of the hash slots the sender currently owns.
- A bitmap of 16,384 bits is exactly 2 kilobytes, small enough to send on every heartbeat.
- If Redis used 65,536 slots (the full CRC16 output range), the bitmap would grow to 8 kilobytes per heartbeat, a lot of bandwidth to spend on background gossip. 16,384 was chosen to balance fine-grained sharding against heartbeat size; clusters are not expected to grow past roughly 1,000 masters, so more slots would add little.
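The two payload figures above come from simple arithmetic: one bit per slot, eight bits per byte. A tiny sketch to make that explicit (the class and method names are made up for this example):

```java
class SlotBitmap {
    /** Bytes needed for a bitmap with one bit per hash slot. */
    static int bitmapBytes(int slots) {
        return (slots + 7) / 8;
    }

    public static void main(String[] args) {
        System.out.println(bitmapBytes(16_384)); // 2048 bytes = 2 KB
        System.out.println(bitmapBytes(65_536)); // 8192 bytes = 8 KB
    }
}
```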
How Slot Assignment Works
# Key → slot mapping: CRC16(key) mod 16384
CLUSTER KEYSLOT mykey # → e.g., 14328
# Keys with hash tags: only the text inside {...} is hashed, so these share a slot
CLUSTER KEYSLOT "{user:123}.profile" # Same slot as {user:123}.orders
CLUSTER KEYSLOT "{user:123}.orders"
This matters for multi-key commands (MGET, MSET), transactions, and Lua scripts: all keys must be in the same slot.
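To make the mapping concrete, here is a minimal sketch of the slot calculation, assuming the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0) and the hash tag rule from the Redis Cluster specification: if the key contains a "{" followed later by a "}" with at least one character in between, only that substring is hashed. The class name HashSlot and its methods are illustrative, not part of any client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class HashSlot {
    static final int SLOT_COUNT = 16_384;

    /** Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0x0000, no reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Slot for a key, hashing only the hash tag content when a non-empty tag is present. */
    static int slotFor(String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        int open = -1;
        for (int i = 0; i < k.length; i++) {
            if (k[i] == '{') { open = i; break; }
        }
        if (open >= 0) {
            for (int close = open + 1; close < k.length; close++) {
                if (k[close] == '}') {
                    // An empty tag "{}" is ignored and the whole key is hashed
                    if (close > open + 1) {
                        k = Arrays.copyOfRange(k, open + 1, close);
                    }
                    break;
                }
            }
        }
        return crc16(k) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(slotFor("{user:123}.profile") == slotFor("{user:123}.orders")); // true
        System.out.println(slotFor("{user:123}.profile") == slotFor("user:123"));          // true
    }
}
```

In practice a cluster-aware client computes this for you; the sketch only shows why two keys that share a tag always land together.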
MOVED and ASK Redirects
Client → Node A: GET mykey
← MOVED 14328 redis-node-c:6379 # Slot is owned by another node; client must resend the command there
Client → Node C: GET mykey
← "myvalue"
Smart clients (Lettuce, Jedis) handle redirects automatically and cache the slot routing table.
ASK 14328 redis-node-d:6379 # Temporary redirect during slot migration (resharding): send ASKING, then the command, to that node for this one request only
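The practical difference between the two redirects is what a client does with its cached routing table. A simplified sketch, assuming error replies of the form "MOVED <slot> <host:port>" and "ASK <slot> <host:port>": MOVED updates the cache, ASK is followed once (after sending ASKING to the target) and leaves the cache alone. RedirectHandler is a hypothetical class; real clients such as Lettuce and Jedis implement this internally.

```java
import java.util.HashMap;
import java.util.Map;

class RedirectHandler {
    // slot number -> "host:port" of the node believed to own it
    private final Map<Integer, String> slotCache = new HashMap<>();

    /** Returns the node to retry against; only MOVED changes the cached slot owner. */
    String handle(String errorReply) {
        String[] parts = errorReply.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not a redirect: " + errorReply);
        }
        int slot = Integer.parseInt(parts[1]);
        String node = parts[2];
        switch (parts[0]) {
            case "MOVED":
                slotCache.put(slot, node); // ownership changed for good
                return node;
            case "ASK":
                return node; // one-off: caller sends ASKING first, cache stays untouched
            default:
                throw new IllegalArgumentException("Not a redirect: " + errorReply);
        }
    }

    String cachedNode(int slot) {
        return slotCache.get(slot);
    }

    public static void main(String[] args) {
        RedirectHandler handler = new RedirectHandler();
        handler.handle("MOVED 14328 redis-node-c:6379");
        handler.handle("ASK 14328 redis-node-d:6379");
        System.out.println(handler.cachedNode(14328)); // redis-node-c:6379
    }
}
```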
Cluster Configuration
# Create cluster (minimum 3 masters, recommended 3 masters + 3 replicas)
redis-cli --cluster create \
node1:6379 node2:6379 node3:6379 \
node4:6379 node5:6379 node6:6379 \
--cluster-replicas 1
# Check cluster status
CLUSTER INFO
CLUSTER NODES
CLUSTER SLOTS # Shows slot→node mapping (deprecated since Redis 7.0 in favor of CLUSTER SHARDS)
Cross-Slot Operations
# ❌ These fail in Cluster with a CROSSSLOT error when the keys hash to different slots
MSET user:1 "Alice" user:2 "Bob" # Different slots!
MGET user:1 user:2
SUNIONSTORE result set1 set2
# ✅ Use hash tags to co-locate keys
# (a shared tag like {user} puts every such key in one slot; tag per entity in practice)
MSET {user}.1 "Alice" {user}.2 "Bob" # Same slot → works!
MGET {user}.1 {user}.2
# Or: accept the constraint and use per-key operations
GET user:1
GET user:2 # Two round-trips, but cluster-safe
Spring Boot with Cluster
In production, enable topology refresh to handle slot migrations or node failures gracefully:
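import java.time.Duration;
import java.util.Arrays;

import io.lettuce.core.ReadFrom;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

@Configuration
public class ClusterConfig {
    @Bean
    public RedisConnectionFactory redisConnectionFactory() {
        // Seed nodes; the client discovers the rest of the cluster from them
        RedisClusterConfiguration clusterConfig =
                new RedisClusterConfiguration(Arrays.asList(
                        "redis-node1:6379", "redis-node2:6379", "redis-node3:6379"
                ));
        // Follow at most 3 MOVED/ASK redirects per command
        clusterConfig.setMaxRedirects(3);
        // Refresh the slot map every 60s and whenever redirects or reconnects hint at a change
        ClusterTopologyRefreshOptions topologyRefreshOptions =
                ClusterTopologyRefreshOptions.builder()
                        .enablePeriodicRefresh(Duration.ofSeconds(60))
                        .enableAllAdaptiveRefreshTriggers()
                        .build();
        ClusterClientOptions clientOptions = ClusterClientOptions.builder()
                .topologyRefreshOptions(topologyRefreshOptions)
                .build();
        LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
                .clientOptions(clientOptions)
                .readFrom(ReadFrom.REPLICA_PREFERRED) // Read scaling
                .build();
        return new LettuceConnectionFactory(clusterConfig, clientConfig);
    }
}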
Sentinel vs Cluster
| | Sentinel | Cluster |
|---|---|---|
| Purpose | HA failover | HA + horizontal scaling |
| Data sharding | ❌ (all data on every node) | ✅ (sharded across nodes) |
| Max dataset size | Bounded by RAM of single node | N × RAM (N masters) |
| Write throughput | Single node | N × (single node) |
| Multi-key operations | ✅ Always possible | ⚠️ Same-slot only |
| Operational complexity | Medium | High |
| Minimum nodes | 1 master + 2 replicas + 3 Sentinels | 6 (3 masters + 3 replicas) |
| Client complexity | Sentinel discovery | Cluster routing |
Choose Sentinel when: dataset fits comfortably in RAM of one node, operations are simple, want predictable multi-key behavior.
Choose Cluster when: dataset exceeds single-node RAM, need write throughput scaling, building for large scale.
Production Deployment Best Practices
Anti-Affinity
Always place master and its replica on different hosts (ideally different AZs):
# Kubernetes: anti-affinity for Redis master + replica
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: redis
        topologyKey: kubernetes.io/hostname
🧠 Senior Deep Dive: Split-Brain & The Gossip Protocol
In a cluster, nodes talk to each other to detect failures using a Gossip Protocol. If a network partition occurs (a cable gets cut), the cluster splits in two.
Imagine [Master A + some clients] on one side of the cut, and [Master B + Master C + Replicas A, B, C] on the other.
- The side with Masters B and C holds the majority of masters, so it can mark A as failed and promote Replica A to become the new master for A's slots. (A side holding only a minority of masters cannot do this.)
- Meanwhile, the original Master A keeps accepting writes from the clients on its side until it notices it has been cut off (roughly cluster-node-timeout later).
- The Result: for a short window, two masters accept writes for the exact same hash slots independently. When the cable is fixed, the old Master A is demoted to a replica of the new master, and the writes it accepted during the partition are discarded, because Redis cannot merge the two histories.
The Fix: Bound that window with min-replicas-to-write and min-replicas-max-lag.
# Master refuses writes if no replica has acknowledged within min-replicas-max-lag seconds
min-replicas-to-write 1
# → In an isolated partition, Master A stops accepting writes once its replicas exceed the lag limit. The split-brain window is bounded, not eliminated.
Slow Log
# Log commands taking > 10ms (the threshold is in microseconds)
slowlog-log-slower-than 10000
# Keep the last 128 slow commands
slowlog-max-len 128
# From redis-cli: see the last 10 slow commands, then clear the log
SLOWLOG GET 10
SLOWLOG RESET
Spring Boot Custom Health Check
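// Package names below assume Spring Boot 2.x/3.x with Spring Data Redis
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class RedisHealthIndicator extends AbstractHealthIndicator {
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    @Override
    protected void doHealthCheck(Health.Builder builder) {
        try {
            // PING round-trips to the server; an exception means Redis is unreachable
            redisTemplate.execute((RedisCallback<String>) conn -> {
                conn.ping();
                return "PONG";
            });
            builder.up().withDetail("status", "Reachable");
        } catch (Exception e) {
            builder.down(e).withDetail("error", e.getMessage());
        }
    }
}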
Interview Questions
Q: How do you decide between Sentinel and Cluster for a new platform?
A: Use Sentinel for simpler HA when data fits one primary; use Cluster for horizontal write and memory scaling.
Q: What failure behavior should teams expect during Sentinel failover?
A: A short write interruption during detection, election, promotion, and client re-discovery.
Q: Why do hash tags matter in Redis Cluster design?
A: They co-locate related keys in one slot, enabling safe multi-key operations.
Q: How do you reduce data loss risk with async replication?
A: Tune the min-replicas constraints, monitor lag, and pair them with a durable persistence strategy.
Q: What is a common operational anti-pattern in Redis HA setups?
A: Placing primaries and replicas in the same failure domain, which defeats the purpose of failover.
Q: Which cluster metrics should trigger urgent investigation?
A: Replication lag growth, failed failovers, slot migration instability, and elevated client redirect errors.