Redis Clustering, Replication & High Availability


Replication​

Redis uses asynchronous master-replica replication. All writes go to the master; replicas receive a copy asynchronously.

👶 Beginner Concept: The "Franchise Recipe" Analogy

Imagine you own a famous pizza franchise.

  • The Master (Headquarters): This is the only place allowed to write new recipes (WRITE operations). When the Master creates a new pizza, it faxes the new recipe to all its franchise stores in the background (the asynchronous replication stream), without waiting for them to confirm receipt.
  • The Replicas (Franchise Stores): These stores are strictly read-only. Customers can walk into any branch to order (READ operations) and will usually get the same pizza, although a store whose fax has not arrived yet may briefly serve an older menu. If a branch burns down, the Master keeps operating; when a replacement store opens, the Master faxes ALL the recipes to it from scratch (Full Sync). If the Master burns down, one of the branches must be officially promoted to Headquarters to accept new recipes.
Master (read + write)
├── Replica 1 (read-only) ← async replication stream
├── Replica 2 (read-only)
└── Replica 3 (read-only)

How Replication Works​

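1. Replica connects to the master and sends PSYNC
2. If a partial resync is not possible, the master replies FULLRESYNC and sends an RDB snapshot
3. While the snapshot is being generated and transferred, the master buffers new write commands
4. Replica loads the RDB, then applies the buffered commands
5. Ongoing: the master streams every write command to the replica over the replication connection; the replication backlog lets a briefly disconnected replica catch up with a partial resync instead of a full one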
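# redis.conf on replica
replicaof master-host 6379
replica-read-only yes

# Monitor replication lag (run in redis-cli)
INFO replication
# On the master: master_repl_offset, plus one line per replica such as
#   slave0:ip=...,port=...,state=online,offset=...,lag=1
#   (lag = seconds since that replica's last acknowledgement)
# On a replica: slave_repl_offset, master_link_status, master_last_io_seconds_ago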
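To make the offset comparison concrete, here is a minimal sketch that parses master-side INFO replication text and reports how many bytes each replica is behind. The class and method names (`ReplicationLag`, `bytesBehind`) are hypothetical, and the sketch assumes the usual `slaveN:...,offset=...` line format; it is an illustration, not a monitoring tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ReplicationLag {
    /** Returns replica name -> bytes behind the master, from master-side INFO replication text. */
    static Map<String, Long> bytesBehind(String info) {
        long masterOffset = -1;
        Map<String, Long> replicaOffsets = new LinkedHashMap<>();
        for (String line : info.split("\\r?\\n")) {
            if (line.startsWith("master_repl_offset:")) {
                masterOffset = Long.parseLong(line.substring("master_repl_offset:".length()).trim());
            } else if (line.startsWith("slave") && line.contains("offset=")) {
                // Example line: slave0:ip=10.0.0.2,port=6379,state=online,offset=1000,lag=0
                String name = line.substring(0, line.indexOf(':'));
                for (String field : line.substring(line.indexOf(':') + 1).split(",")) {
                    if (field.startsWith("offset=")) {
                        replicaOffsets.put(name, Long.parseLong(field.substring("offset=".length()).trim()));
                    }
                }
            }
        }
        if (masterOffset < 0) {
            throw new IllegalArgumentException("master_repl_offset not found");
        }
        Map<String, Long> behind = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : replicaOffsets.entrySet()) {
            behind.put(e.getKey(), masterOffset - e.getValue());
        }
        return behind;
    }
}
```

A value that keeps growing for one replica suggests that replica cannot keep up with the write stream.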

Replication Lag and Consistency​

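Redis replication is asynchronous by default, so replicas may be behind the master. A failover while a replica is lagging loses the writes that replica had not yet received.

# Write gating (best effort, not synchronous replication)
# The master does not wait for a per-write acknowledgement. It refuses writes
# when too few replicas are connected with an acceptable lag.
# Note: redis.conf does not allow a comment on the same line as a directive.
# Require at least 1 healthy replica:
min-replicas-to-write 1
# A replica counts as healthy if its last acknowledgement is at most 10 seconds old:
min-replicas-max-lag 10
# If the condition is not met, the master rejects writes, which bounds the data-loss window.
# For a per-write acknowledgement, a client can issue the WAIT command after writing.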
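The decision the master makes under these two settings can be sketched as a small predicate. This is an illustrative model, not Redis source code; `WriteGate` and `acceptsWrites` are hypothetical names, and each array entry stands for the seconds since one replica's last acknowledgement.

```java
final class WriteGate {
    /**
     * Models the min-replicas-to-write / min-replicas-max-lag rule:
     * a replica is "good" if its last acknowledgement is at most maxLagSeconds old,
     * and writes are accepted only when enough good replicas exist.
     */
    static boolean acceptsWrites(int[] replicaLagSeconds, int minReplicasToWrite, int maxLagSeconds) {
        if (minReplicasToWrite == 0 || maxLagSeconds == 0) {
            return true; // setting either value to 0 disables the check
        }
        int good = 0;
        for (int lag : replicaLagSeconds) {
            if (lag <= maxLagSeconds) {
                good++;
            }
        }
        return good >= minReplicasToWrite;
    }
}
```

With `min-replicas-to-write 1` and `min-replicas-max-lag 10`, a master whose only replicas last acknowledged 11 and 30 seconds ago would refuse writes.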

Read replicas for read scaling:

// Lettuce (Spring): read from replicas for read-heavy workloads
LettuceClientConfiguration config = LettuceClientConfiguration.builder()
    .readFrom(ReadFrom.REPLICA_PREFERRED) // Prefer replica, fall back to master
    .build();

Redis Sentinel: High Availability

Sentinel provides automatic failover for Redis without sharding. A deployment consists of at least three Sentinel processes; an odd number makes a majority vote unambiguous.

┌──────────────────────────────┐
│ Sentinel Cluster (quorum)    │
│ Sentinel 1                   │
│ Sentinel 2 ← majority vote   │
│ Sentinel 3                   │
└──────────────────────────────┘
        │ monitors
        ↓
    [ Master ]
     /      \
[Replica 1] [Replica 2]

Failover Process​

1. A Sentinel detects that the master is unreachable (subjectively down, SDOWN)
2. If at least quorum Sentinels agree → objectively down (ODOWN)
3. Sentinels elect a leader Sentinel (this requires a majority of all Sentinels)
4. Leader promotes the best replica: lowest replica-priority first, then the most advanced replication offset
5. Other replicas are reconfigured to follow the new master
6. Old master (if it recovers) is reconfigured as a replica
# sentinel.conf (comments must be on their own lines)
# Monitor "mymaster"; quorum = 2 Sentinels must agree that it is down
sentinel monitor mymaster 127.0.0.1 6379 2
# Unreachable for 5s = SDOWN
sentinel down-after-milliseconds mymaster 5000
# Failover timeout of 10s
sentinel failover-timeout mymaster 10000
# 1 replica resyncs with the new master at a time during failover
sentinel parallel-syncs mymaster 1
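The quorum and the leader election use different thresholds, which is easy to miss. The sketch below models both checks as plain arithmetic; `SentinelVotes` and its method names are hypothetical. It assumes the documented Sentinel behavior: the quorum only decides ODOWN, while starting a failover needs a majority of all Sentinels and never fewer votes than the quorum.

```java
final class SentinelVotes {
    /** ODOWN: at least `quorum` Sentinels report the master as down. */
    static boolean isObjectivelyDown(int downReports, int quorum) {
        return downReports >= quorum;
    }

    /** Leader election: needs a majority of all Sentinels, and never fewer votes than the quorum. */
    static boolean canLeadFailover(int votes, int totalSentinels, int quorum) {
        int majority = totalSentinels / 2 + 1;
        return votes >= Math.max(majority, quorum);
    }
}
```

With 5 Sentinels and a quorum of 2, two Sentinels are enough to flag ODOWN, but a failover still needs three votes. This is why a minority partition of Sentinels cannot promote a replica on its own.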

Failover time: Typically 15–30 seconds (detection + election + promotion). During this time: no writes (old master is down, new not yet promoted).

// Spring Boot Sentinel configuration
@Bean
public RedisConnectionFactory redisConnectionFactory() {
    RedisSentinelConfiguration sentinelConfig = new RedisSentinelConfiguration()
        .master("mymaster")
        .sentinel("sentinel1", 26379)
        .sentinel("sentinel2", 26379)
        .sentinel("sentinel3", 26379);
    return new LettuceConnectionFactory(sentinelConfig);
}

🧠 Senior Deep Dive: Redis Cluster & Hash Slots​

When your dataset exceeds the RAM of a single physical server (e.g., 500GB of cache), Sentinel does not help, because every node in a Sentinel deployment still holds 100% of the data. You need horizontal sharding.

Redis Cluster shards data across multiple master nodes using exactly 16,384 Hash Slots.

16,384 slots distributed across 3 masters:
┌──────────────────┬───────────────────┬───────────────────┐
│ Master A         │ Master B          │ Master C          │
│ Slots 0–5460     │ Slots 5461–10922  │ Slots 10923–16383 │
│ └── Replica A    │ └── Replica B     │ └── Replica C     │
└──────────────────┴───────────────────┴───────────────────┘

Why Exactly 16,384 Slots?​

Nodes in a Redis Cluster continuously exchange heartbeat packets with a few other nodes at a time (the gossip protocol) to share state. Each heartbeat includes a bitmap of the hash slots the sender currently owns.

  • A bitmap of 16,384 bits is exactly 2 kilobytes, a small fixed cost per heartbeat.
  • With 65,536 slots (the full 16-bit range of CRC16), the bitmap would grow to 8 kilobytes per heartbeat, a significant bandwidth overhead for background gossip. Because the suggested maximum cluster size is on the order of 1,000 nodes, 16,384 slots balance fine-grained sharding against heartbeat size.
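The payload sizes quoted above follow from one bit per slot. A tiny sketch of that arithmetic, with the hypothetical helper name `SlotBitmap.bitmapBytes`:

```java
final class SlotBitmap {
    /** Bytes needed for a bitmap that stores one bit per hash slot. */
    static int bitmapBytes(int slots) {
        return (slots + 7) / 8; // round up to whole bytes
    }
}
```

`SlotBitmap.bitmapBytes(16384)` gives 2048 bytes (2 KB), while `SlotBitmap.bitmapBytes(65536)` gives 8192 bytes (8 KB).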

How Slot Assignment Works​

# Key → slot mapping: CRC16(key) mod 16384
# Returns the slot number, e.g., 14328 (illustrative value)
CLUSTER KEYSLOT mykey

# Keys with hash tags go to the same slot: only the text inside {...} is hashed
# Both of these return the same slot
CLUSTER KEYSLOT "{user:123}.profile"
CLUSTER KEYSLOT "{user:123}.orders"

This matters for multi-key commands (MGET, MSET, SUNIONSTORE), transactions, and Lua scripts: all keys involved must be in the same slot.
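For readers who want to see the mapping itself, here is a minimal sketch of the slot calculation: CRC16 (the XMODEM variant, polynomial 0x1021) of the key, or of the hash tag if one is present, modulo 16384. The class name `KeySlot` is hypothetical; smart clients ship their own implementation, so this is only meant to show the rule.

```java
import java.nio.charset.StandardCharsets;

final class KeySlot {
    /** CRC16-XMODEM: polynomial 0x1021, initial value 0, no bit reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Hash slot of a key; if the key has a non-empty {tag}, only the tag is hashed. */
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) { // ignore an empty "{}" tag
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

`KeySlot.slot("{user:123}.profile")` and `KeySlot.slot("{user:123}.orders")` return the same slot because both hash only `user:123`. The reference test vector from the Redis Cluster specification, `"123456789"`, has CRC16 0x31C3, which maps to slot 12739.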

MOVED and ASK Redirects​

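Client → Node A: GET mykey
← MOVED 14328 redis-node-c:6379 # Slot is owned by another node; client must resend the command there
Client → Node C: GET mykey
← "myvalue"

Smart clients (Lettuce, Jedis) handle redirects automatically and cache the slot routing table.

ASK 14328 redis-node-d:6379 # Temporary redirect during slot migration (resharding)

On ASK, the client sends ASKING followed by the command to the named node for that one request only, and does not update its cached slot table. On MOVED, it updates the table.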
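A client has to pull the slot and target address out of the redirect error before it can retry. The following is a minimal parsing sketch under the assumption that the error text has the form `MOVED <slot> <host>:<port>` or `ASK <slot> <host>:<port>`; the `Redirect` class is hypothetical and is not how Lettuce or Jedis are implemented.

```java
final class Redirect {
    final boolean ask; // true for ASK (one-off redirect), false for MOVED (update routing table)
    final int slot;
    final String host;
    final int port;

    private Redirect(boolean ask, int slot, String host, int port) {
        this.ask = ask;
        this.slot = slot;
        this.host = host;
        this.port = port;
    }

    /** Parses a MOVED or ASK error; returns null for any other error text. */
    static Redirect parse(String error) {
        String[] parts = error.trim().split(" ");
        if (parts.length != 3) {
            return null;
        }
        boolean isAsk = parts[0].equals("ASK");
        if (!isAsk && !parts[0].equals("MOVED")) {
            return null;
        }
        int colon = parts[2].lastIndexOf(':'); // last colon separates host from port
        if (colon < 0) {
            return null;
        }
        return new Redirect(isAsk,
                Integer.parseInt(parts[1]),
                parts[2].substring(0, colon),
                Integer.parseInt(parts[2].substring(colon + 1)));
    }
}
```

A caller would retry against `host:port` in both cases, sending ASKING first when `ask` is true and refreshing its slot cache when it is false.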

Cluster Configuration​

# Create cluster (minimum 3 masters, recommended 3 masters + 3 replicas)
redis-cli --cluster create \
  node1:6379 node2:6379 node3:6379 \
  node4:6379 node5:6379 node6:6379 \
  --cluster-replicas 1

# Check cluster status (run in redis-cli)
CLUSTER INFO
CLUSTER NODES
# Shows slot→node mapping (deprecated since Redis 7.0 in favor of CLUSTER SHARDS)
CLUSTER SLOTS

Cross-Slot Operations​

# ❌ These fail in Cluster with a CROSSSLOT error when the keys hash to different slots
MSET user:1 "Alice" user:2 "Bob"
MGET user:1 user:2
SUNIONSTORE result set1 set2

# ✅ Use hash tags to co-locate keys (same slot, so multi-key commands work)
# Caution: a single shared tag such as {user} puts every such key in one slot;
# prefer a per-entity tag like {user:123} to keep slots balanced
MSET {user}.1 "Alice" {user}.2 "Bob"
MGET {user}.1 {user}.2

# Or: accept the constraint and use per-key operations
# (two round-trips, but cluster-safe)
GET user:1
GET user:2

Spring Boot with Cluster​

In production, enable topology refresh to handle slot migrations or node failures gracefully:

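import java.time.Duration;
import java.util.Arrays;

import io.lettuce.core.ReadFrom;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

@Configuration
public class ClusterConfig {

    @Bean
    public RedisConnectionFactory redisConnectionFactory() {
        // Seed nodes; the client discovers the rest of the topology from them
        RedisClusterConfiguration clusterConfig =
            new RedisClusterConfiguration(Arrays.asList(
                "redis-node1:6379", "redis-node2:6379", "redis-node3:6379"
            ));

        // Follow at most 3 MOVED/ASK redirects per command
        clusterConfig.setMaxRedirects(3);

        // Refresh the slot map periodically and on redirect or reconnect events
        ClusterTopologyRefreshOptions topologyRefreshOptions =
            ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh(Duration.ofSeconds(60))
                .enableAllAdaptiveRefreshTriggers()
                .build();

        ClusterClientOptions clientOptions = ClusterClientOptions.builder()
            .topologyRefreshOptions(topologyRefreshOptions)
            .build();

        LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
            .clientOptions(clientOptions)
            .readFrom(ReadFrom.REPLICA_PREFERRED) // Read scaling
            .build();

        return new LettuceConnectionFactory(clusterConfig, clientConfig);
    }
}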

Sentinel vs Cluster​

| | Sentinel | Cluster |
|---|---|---|
| Purpose | HA failover | HA + horizontal scaling |
| Data sharding | ❌ (all data on every node) | ✅ (sharded across nodes) |
| Max dataset size | Bounded by RAM of a single node | N × RAM (N = number of masters) |
| Write throughput | Single node | N × (single node) |
| Multi-key operations | ✅ Always possible | ❌ Same-slot only |
| Operational complexity | Medium | High |
| Minimum nodes | 1 master + 2 replicas + 3 Sentinels | 6 (3 masters + 3 replicas) |
| Client complexity | Sentinel discovery | Cluster routing |

Choose Sentinel when: dataset fits comfortably in RAM of one node, operations are simple, want predictable multi-key behavior.

Choose Cluster when: dataset exceeds single-node RAM, need write throughput scaling, building for large scale.


Production Deployment Best Practices​

Anti-Affinity​

Always place master and its replica on different hosts (ideally different AZs):

# Kubernetes: anti-affinity for Redis master + replica
# (use topologyKey: topology.kubernetes.io/zone to spread across AZs instead of hosts)
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: redis
        topologyKey: kubernetes.io/hostname

🧠 Senior Deep Dive: Split-Brain & The Gossip Protocol​

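In a cluster, nodes detect failures by exchanging heartbeats over the gossip protocol. If a network partition occurs (a cable gets cut), the cluster splits into two groups that cannot see each other.

Imagine [Master A + some clients] on one side of the cut, and [Master B + Master C + Replicas A, B, C] on the other.

  • The side with Masters B and C holds a majority of masters. It marks A as failed and promotes Replica A to be the new master for A's slots. (A minority side cannot promote anyone, because a promotion needs votes from a majority of masters.)
  • Meanwhile, the original Master A keeps accepting writes from the clients on its side until it notices that it has lost contact with the majority (after cluster-node-timeout).
  • The result: for a short window, two masters accept writes for the exact same hash slots independently. When the cable is fixed, the old Master A rejoins as a replica of the new master, and the writes it accepted during the window are discarded. Redis does not merge the divergent data.

The fix: configure min-replicas-to-write together with min-replicas-max-lag to shrink that window.

# Master refuses writes if fewer than 1 replica has acknowledged recently
min-replicas-to-write 1
# A replica counts only if its last acknowledgement is at most 10 seconds old
min-replicas-max-lag 10
# → In the isolated partition, Master A stops accepting writes once its replicas'
# acknowledgements are older than min-replicas-max-lag, which bounds the split-brain window.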

Slow Log​

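# redis.conf (comments must be on their own lines)
# Log commands taking longer than 10 ms (the value is in microseconds)
slowlog-log-slower-than 10000
# Keep the last 128 slow commands
slowlog-max-len 128

# In redis-cli: show the last 10 slow commands, then clear the log
SLOWLOG GET 10
SLOWLOG RESET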

Spring Boot Custom Health Check​

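// Package names below are those used by Spring Boot 2.x/3.x
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class RedisHealthIndicator extends AbstractHealthIndicator {

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    @Override
    protected void doHealthCheck(Health.Builder builder) {
        try {
            // PING round-trip proves the connection is usable
            redisTemplate.execute((RedisCallback<String>) conn -> {
                conn.ping();
                return "PONG";
            });
            builder.up().withDetail("status", "Reachable");
        } catch (Exception e) {
            builder.down(e).withDetail("error", e.getMessage());
        }
    }
}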

Interview Questions​

Q: How do you decide between Sentinel and Cluster for a new platform?​

A: Use Sentinel for simpler HA when data fits one primary; use Cluster for horizontal write and memory scaling.

Q: What failure behavior should teams expect during Sentinel failover?​

A: A short write interruption during detection, election, promotion, and client re-discovery.

Q: Why do hash tags matter in Redis Cluster design?​

A: They co-locate related keys in one slot, enabling safe multi-key operations.

Q: How do you reduce data loss risk with async replication?​

A: Tune the min-replicas constraints, monitor lag, and pair them with a durable persistence strategy.

Q: What is a common operational anti-pattern in Redis HA setups?​

A: Placing primaries and replicas in the same failure domain, which defeats the purpose of failover.

Q: Which cluster metrics should trigger urgent investigation?​

A: Replication lag growth, failed failovers, slot migration instability, and elevated client redirect errors.