Consistent Hashing: An Advanced Architectural Guide

In distributed systems, managing the deterministic placement of state across multiple transient nodes is a foundational challenge. Whether you are designing a distributed cache, a partitioned database, or a message queue, the routing topology directly dictates the system's elasticity and resilience.

This guide breaks down the progression from naive hashing to Consistent Hashing, explicitly focusing on the underlying mathematics, routing mechanics, real-world implementation details, and senior-level system design considerations.

📚 Table of Contents

The Bottleneck: Naive Modulo Hashing
The Solution: Consistent Hashing
Resolving Data Skew: Virtual Nodes
High Availability: Replication on the Ring
How Consistent Hashing Works Internally
Real-World Implementations
Integration Patterns
Pros and Cons
Interview Questions
Senior Deep Dive: Advanced Topics

🚧 1. The Bottleneck: Naive Modulo Hashing

When horizontally scaling a stateful tier (e.g., moving from 1 cache node to 3), the immediate architectural question is: How do we determine which server stores which piece of data? The standard, naive approach uses a simple Hash and Modulo function:

The Naive Approach

Pass the routing key (e.g., user_id or session_token) through a hashing algorithm (like MD5 or MurmurHash).
Apply the modulo operator against the current total number of servers $N$ .
Route the data to the resulting index: $Index = hash(key) \bmod N$

Code Example

def get_server_naive(key, num_servers):
    hash_value = hash(key)  # or use MD5, MurmurHash, etc.
    return hash_value % num_servers

# Example
server = get_server_naive("user_12345", 3)  # Returns 0, 1, or 2

The Mass Redistribution Cascade

This deterministic approach works flawlessly in a static environment but fails catastrophically during scaling events. If traffic spikes and a 4th server is added, the denominator $N$ in the modulo function changes from $3$ to $4$ .

Because the modulo base has changed, the mathematical output for almost every single existing key changes. For example, a hash value of $67211 \bmod 3$ routes to Server $2$ . But $67211 \bmod 4$ routes to Server $3$ .

The Math of Failure

When the number of nodes changes from $N$ to $N+1$ , the probability that a key must be remapped is $N/(N+1)$ . In a 10-node cluster, adding one node forces 90% of the existing data to move. This triggers a massive "thundering herd" of network I/O, cache misses, and database thrashing, often causing cascading failures across the entire backend architecture.

Visual Example

Initial State (3 servers):
┌─────────────────────────────────────────────────────────────┐
│ Key: user_1  → hash: 12345  → 12345 % 3 = 0  → Server A     │
│ Key: user_2  → hash: 67890  → 67890 % 3 = 0  → Server A     │
│ Key: user_3  → hash: 11111  → 11111 % 3 = 1  → Server B     │
│ Key: user_4  → hash: 22222  → 22222 % 3 = 2  → Server C     │
└─────────────────────────────────────────────────────────────┘

After Adding Server D (4 servers):
┌─────────────────────────────────────────────────────────────┐
│ Key: user_1  → hash: 12345  → 12345 % 4 = 1  → Server B ❌  │
│ Key: user_2  → hash: 67890  → 67890 % 4 = 2  → Server C ❌  │
│ Key: user_3  → hash: 11111  → 11111 % 4 = 3  → Server D ❌  │
│ Key: user_4  → hash: 22222  → 22222 % 4 = 2  → Server C ❌  │
└─────────────────────────────────────────────────────────────┘
Result: ALL keys moved to different servers!

Impact on Production Systems

Cache Miss Storm:

When 90% of keys are remapped, the cache hit rate drops from ~95% to ~5%
Database load increases 20x, potentially causing database overload
Application latency increases dramatically
User experience degrades significantly

Data Migration Cost:

Network bandwidth consumed by moving terabytes of data
Extended migration windows (hours to days)
Increased risk of data loss during migration
Complex rollback procedures

✅ 2. The Solution: Consistent Hashing

To safely scale without catastrophic data movement, we must decouple the data mapping from the immediate number of active servers. Consistent Hashing achieves this by projecting both the data keys and the server identifiers onto a massive, fixed geometric space.

2.1 The Hash Ring Mechanics

Concept Overview

The Hash Space: We construct an abstract, circular ring representing a vast integer space. For instance, using SHA-1 yields a 160-bit hash space, ranging from $0$ to $2^{160} - 1$ .
Placing Servers: We hash the unique identifiers of the servers (e.g., their IP addresses or hostnames) using the exact same hash function. We place them as fixed coordinates on this ring.
Routing Data: To store or retrieve an item, we hash the item's key to find its coordinate on the ring. We then walk clockwise until we encounter the first server node. That server becomes the data's authoritative owner.

Visual Representation

                    0
                  ┌───┐
              255 │   │ 1
          ┌────────┤   ├────────┐
          │        └───┘        │
      192 │                   │ 64
          │                   │
          │                   │
      128 │                   │ 128
          │                   │
          │        ┌───┐      │
          └────────┤   ├──────┘
              65  │   │ 192
                  └───┘
                    255

Hash Ring (0-255 for simplicity)

Server A: hash("10.0.0.1") = 42
Server B: hash("10.0.0.2") = 150
Server C: hash("10.0.0.3") = 210

Key "user_1": hash("user_1") = 100 → Route to Server B (first clockwise)
Key "user_2": hash("user_2") = 200 → Route to Server C (first clockwise)
Key "user_3": hash("user_3") = 300 → Wrap to 0, route to Server A

2.2 Time Complexity and Implementation

In software, the "ring" is typically implemented as a self-balancing binary search tree (like a Red-Black Tree or Java's TreeMap) or a simple sorted array.

Implementation Options

Option 1: Sorted Array with Binary Search

import bisect

class ConsistentHash:
    def __init__(self, replicas=1):
        self.replicas = replicas
        self.ring = []
        self.sorted_keys = []

    def _hash(self, key):
        # Use a good hash function like MD5 or MurmurHash
        import hashlib
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            virtual_node = f"{node}#{i}"
            key = self._hash(virtual_node)
            self.ring.append((key, node))
        self.ring.sort()
        self.sorted_keys = [k for k, _ in self.ring]

    def get_node(self, key):
        if not self.ring:
            return None
        hash_value = self._hash(key)
        # Binary search to find the first node with hash >= key
        idx = bisect.bisect_left(self.sorted_keys, hash_value)
        if idx == len(self.sorted_keys):
            idx = 0  # Wrap around
        return self.ring[idx][1]

Option 2: TreeMap (Java)

import java.util.TreeMap;

public class ConsistentHash<T> {
    private final TreeMap<Long, T> ring = new TreeMap<>();
    private final int replicas;

    public ConsistentHash(int replicas) {
        this.replicas = replicas;
    }

    private long hash(String key) {
        // Use MurmurHash or similar
        return MurmurHash.hash64(key.getBytes());
    }

    public void addNode(T node) {
        for (int i = 0; i < replicas; i++) {
            long hash = hash(node.toString() + "#" + i);
            ring.put(hash, node);
        }
    }

    public T getNode(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        long hash = hash(key);
        // Find the first node with hash >= key
        Map.Entry<Long, T> entry = ring.ceilingEntry(hash);
        if (entry == null) {
            // Wrap around to the first node
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }
}

Performance Characteristics

Lookup Time: $O(\log S)$ , where $S$ is the number of servers (or virtual nodes).
Add/Remove Node: $O(S)$ for updating the ring structure.
Memory Usage: $O(S \times R)$ , where $R$ is the number of replicas per node.

2.3 Graceful Scaling

Because mapping is determined by clockwise proximity rather than a rigid modulo base, scaling isolates data movement:

Adding a Node

If a new Server $X$ is added between Server $A$ and Server $B$ , it only intercepts data hashed between $A$ and $X$ . Only this specific, localized subset of data migrates from $B$ to $X$ . The rest of the cluster is completely undisturbed.

Example:

Before adding Server D:
┌─────────────────────────────────────────────────────────────┐
│ Ring: A(42) ── B(150) ── C(210) ── (wrap to A)              │
│                                                              │
│ Keys in range [42, 150):  → Server A                        │
│ Keys in range [150, 210): → Server B                        │
│ Keys in range [210, 42):  → Server C                        │
└─────────────────────────────────────────────────────────────┘

After adding Server D at position 100:
┌─────────────────────────────────────────────────────────────┐
│ Ring: A(42) ── D(100) ── B(150) ── C(210) ── (wrap to A)   │
│                                                              │
│ Keys in range [42, 100):  → Server A (unchanged) ✓          │
│ Keys in range [100, 150): → Server D (NEW!)                 │
│ Keys in range [150, 210): → Server B (unchanged) ✓          │
│ Keys in range [210, 42):  → Server C (unchanged) ✓          │
└─────────────────────────────────────────────────────────────┘

Result: Only keys in [100, 150) moved from B to D (~25% of data)

Removing a Node

If Server $B$ goes offline, its assigned data simply "falls forward" clockwise to Server $C$ .

Example:

Before removing Server B:
┌─────────────────────────────────────────────────────────────┐
│ Ring: A(42) ── B(150) ── C(210) ── (wrap to A)              │
│                                                              │
│ Keys in range [150, 210): → Server B                        │
└─────────────────────────────────────────────────────────────┘

After removing Server B:
┌─────────────────────────────────────────────────────────────┐
│ Ring: A(42) ── C(210) ── (wrap to A)                        │
│                                                              │
│ Keys in range [150, 210): → Server C (inherited from B)     │
└─────────────────────────────────────────────────────────────┘

Result: Only keys in [150, 210) moved from B to C

2.4 Mathematical Properties

Consistency Property

Consistent hashing satisfies the consistency property: if the hash function is consistent, then adding or removing a server only affects keys that hash to positions near that server.

Formal Definition: For any key $k$ and set of servers $S$ , let $f(k, S)$ be the server that stores key $k$ . Consistent hashing ensures that for any two sets of servers $S_1$ and $S_2$ that differ by only one server, the set of keys that are assigned to different servers is bounded.

Load Distribution

With $N$ servers and $K$ keys, the expected number of keys per server is $K/N$ . The variance depends on the hash function quality and the number of virtual nodes.

⚖️ 3. Resolving Data Skew: Virtual Nodes

While the standard ring solves mass redistribution, it suffers from severe data skew. Hashing a small number of physical servers (e.g., 5 nodes) onto a massive ring almost never results in an even distribution. Furthermore, if a node crashes, its clockwise neighbor inherits 100% of its load, instantly doubling its capacity and risking a cascading failure.

The Problem with Physical Nodes Only

Example of Skew:

Ring with 3 physical nodes:
┌─────────────────────────────────────────────────────────────┐
│ Server A: hash = 42 (13% of ring space)                     │
│ Server B: hash = 150 (42% of ring space) ❌                  │
│ Server C: hash = 210 (45% of ring space) ❌                  │
│                                                              │
│ Result: Server B and C handle 87% of traffic!               │
└─────────────────────────────────────────────────────────────┘

The Virtual Nodes (V-Nodes) Pattern

To balance the load and handle heterogeneous hardware (servers with different capacities), we introduce Virtual Nodes.

Instead of mapping a physical server to a single point on the ring, we map it to multiple points. We achieve this by appending a sequence to the server's identifier before hashing:

hash("10.0.0.1#v1")
hash("10.0.0.1#v2")
hash("10.0.0.1#v3")

How Virtual Nodes Work

Physical Server A with 3 virtual nodes:
┌─────────────────────────────────────────────────────────────┐
│ Virtual Node A#1: hash("10.0.0.1#v1") = 42                  │
│ Virtual Node A#2: hash("10.0.0.1#v2") = 120                 │
│ Virtual Node A#3: hash("10.0.0.1#v3") = 200                 │
│                                                              │
│ All three virtual nodes route to the same physical server A │
└─────────────────────────────────────────────────────────────┘

Benefits of Virtual Nodes

Feature	Impact of Virtual Nodes
Variance Reduction	By placing hundreds of virtual nodes per physical server across the ring, the standard deviation of data distribution drops significantly, approaching a perfectly even split.
Load Spreading	If Physical Server $A$ dies, its dozens of virtual nodes disappear from the ring. The data belonging to those V-Nodes falls forward to numerous different physical servers, smoothly absorbing the orphaned load across the entire cluster.
Weighting	A robust bare-metal server can be assigned 500 V-Nodes, while an older, smaller instance might only be assigned 100, allowing proportional load balancing.
Hotspot Mitigation	Virtual nodes distribute load more evenly, preventing any single physical server from becoming a hotspot.

Choosing the Number of Virtual Nodes

Rule of Thumb:

Small clusters (< 10 nodes): 100-200 virtual nodes per physical node
Medium clusters (10-100 nodes): 50-100 virtual nodes per physical node
Large clusters (> 100 nodes): 10-50 virtual nodes per physical node

Trade-offs:

More virtual nodes: Better load balancing, but higher memory usage and slower lookups
Fewer virtual nodes: Faster lookups, but potentially uneven distribution

Implementation Example

class ConsistentHashWithVirtualNodes:
    def __init__(self, virtual_nodes=100):
        self.virtual_nodes = virtual_nodes
        self.ring = {}
        self.sorted_keys = []

    def _hash(self, key):
        import hashlib
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node, weight=1):
        # Adjust virtual nodes based on weight
        num_virtual = self.virtual_nodes * weight
        for i in range(num_virtual):
            virtual_node = f"{node}#{i}"
            key = self._hash(virtual_node)
            self.ring[key] = node
        self.sorted_keys = sorted(self.ring.keys())

    def get_node(self, key):
        if not self.ring:
            return None
        hash_value = self._hash(key)
        # Find the first virtual node with hash >= key
        for ring_key in self.sorted_keys:
            if ring_key >= hash_value:
                return self.ring[ring_key]
        # Wrap around
        return self.ring[self.sorted_keys[0]]

    def remove_node(self, node):
        keys_to_remove = []
        for ring_key, ring_node in self.ring.items():
            if ring_node == node:
                keys_to_remove.append(ring_key)
        for key in keys_to_remove:
            del self.ring[key]
        self.sorted_keys = sorted(self.ring.keys())

🔄 4. High Availability: Replication on the Ring

In a production system (like Cassandra or DynamoDB), storing data on a single node is unacceptable due to the risk of hardware failure. Consistent hashing natively supports deterministic replication without central coordination.

Replication Strategy

When a write request arrives, the coordinator node finds the primary owner by walking the ring clockwise. To replicate the data with a Replication Factor of $R$ (e.g., $R=3$ ):

The system writes to the primary node.
It continues walking the ring clockwise to find the next $R-1$ distinct physical nodes.
It skips any virtual nodes that map back to a physical machine already holding a replica, ensuring data is distributed across physically isolated hardware (or even distinct availability zones).

Visual Example

Ring with replication factor 3:
┌─────────────────────────────────────────────────────────────┐
│ Physical Servers: A, B, C, D, E                              │
│                                                              │
│ Key "user_1" hashes to position 100                          │
│                                                              │
│ Primary: Server A (first clockwise from 100)                │
│ Replica 1: Server B (next distinct physical server)          │
│ Replica 2: Server C (next distinct physical server)          │
│                                                              │
│ Data stored on: A, B, C                                      │
└─────────────────────────────────────────────────────────────┘

Replication Strategies

Strategy 1: Clockwise Replication

Replicate to the next $R-1$ distinct physical nodes clockwise from the primary.

Pros:

Simple to implement
Deterministic placement
Easy to understand

Cons:

May not optimize for network topology
Replicas might be in the same availability zone

Strategy 2: Rack-Aware Replication

Place replicas on different racks or availability zones to improve fault tolerance.

Example:

def get_replicas(key, replication_factor, topology):
    primary = get_node(key)
    replicas = [primary]

    # Get nodes in different racks/zones
    available_nodes = [n for n in topology if n.rack != primary.rack]

    for i in range(replication_factor - 1):
        if i < len(available_nodes):
            replicas.append(available_nodes[i])
        else:
            # Fallback to any available node
            replicas.append(get_next_node(replicas[-1]))

    return replicas

Strategy 3: Snitch-Based Replication

Use a "snitch" to determine network topology and place replicas optimally.

Cassandra Example:

# cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch

Consistency Levels

Different consistency levels trade off between availability and consistency:

Consistency Level	Description	Latency	Availability
ONE	Acknowledged by 1 replica	Lowest	Highest
QUORUM	Acknowledged by majority (R/2 + 1)	Medium	Medium
ALL	Acknowledged by all replicas	Highest	Lowest

Example:

def write_with_consistency(key, value, consistency_level):
    replicas = get_replicas(key, replication_factor=3)
    responses = []

    for replica in replicas:
        response = send_write(replica, key, value)
        responses.append(response)

        # Check if we've met the consistency requirement
        if len(responses) >= get_required_acks(consistency_level, len(replicas)):
            return True

    return False

Handling Node Failures

When a node fails, the system must:

Detect Failure: Use gossip protocol or health checks
Redirect Traffic: Route requests to the next available replica
Repair Data: Use read repair or anti-entropy to restore replication

Example:

def read_with_repair(key):
    replicas = get_replicas(key, replication_factor=3)
    results = []

    for replica in replicas:
        try:
            result = send_read(replica, key)
            results.append((replica, result))
        except NodeUnavailable:
            continue

    # Find the most recent version
    latest = max(results, key=lambda x: x[1].timestamp)

    # Repair stale replicas
    for replica, result in results:
        if result.timestamp < latest[1].timestamp:
            send_write(replica, key, latest[1].value)

    return latest[1].value

⚙️ 5. How Consistent Hashing Works Internally

Hash Function Selection

The choice of hash function significantly impacts performance and distribution:

Common Hash Functions

Hash Function	Output Size	Speed	Distribution	Use Case
MD5	128-bit	Fast	Good	General purpose
SHA-1	160-bit	Medium	Good	Security-focused
MurmurHash	32/64-bit	Very Fast	Excellent	High-performance
xxHash	64-bit	Very Fast	Excellent	High-performance
CityHash	64/128-bit	Very Fast	Excellent	High-performance

Recommendation: Use MurmurHash or xxHash for performance-critical applications.

Ring Data Structure

Option 1: Sorted Array with Binary Search

class ArrayBasedRing:
    def __init__(self):
        self.nodes = []  # List of (hash, node) tuples

    def add_node(self, node):
        hash_value = hash(node)
        bisect.insort(self.nodes, (hash_value, node))

    def get_node(self, key):
        hash_value = hash(key)
        idx = bisect.bisect_left(self.nodes, (hash_value,))
        if idx == len(self.nodes):
            idx = 0
        return self.nodes[idx][1]

Pros:

Simple implementation
Good cache locality
Fast for small rings

Cons:

$O(N)$ insertion time
Not ideal for frequent updates

Option 2: TreeMap (Red-Black Tree)

public class TreeMapRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public void addNode(String node) {
        long hash = hash(node);
        ring.put(hash, node);
    }

    public String getNode(String key) {
        long hash = hash(key);
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }
}

Pros:

$O(\log N)$ insertion and lookup
Good for dynamic rings
Built-in ceiling operation

Cons:

Higher memory overhead
Slower than array for small rings

Option 3: Jump Consistent Hash

A newer algorithm that provides $O(1)$ lookup without maintaining a ring structure:

def jump_consistent_hash(key, num_buckets):
    import hashlib
    hash_value = int(hashlib.md5(str(key).encode()).hexdigest(), 16)

    b = -1
    j = 0

    while j < num_buckets:
        b = j
        hash_value = (hash_value * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((hash_value >> 33) + 1)))

    return b

Pros:

$O(1)$ lookup time
No memory overhead
Excellent distribution

Cons:

Doesn't support weighted nodes
Harder to understand

Concurrency Considerations

Thread-Safe Implementation

import java.util.concurrent.ConcurrentSkipListMap;

public class ConcurrentConsistentHash {
    private final ConcurrentSkipListMap<Long, String> ring =
        new ConcurrentSkipListMap<>();

    public void addNode(String node) {
        long hash = hash(node);
        ring.put(hash, node);
    }

    public String getNode(String key) {
        long hash = hash(key);
        ConcurrentSkipListMap<Long, String> tail = ring.tailMap(hash);
        if (tail.isEmpty()) {
            return ring.firstEntry().getValue();
        }
        return tail.firstEntry().getValue();
    }
}

Read-Write Lock Pattern

import threading

class ThreadSafeRing:
    def __init__(self):
        self.ring = []
        self.lock = threading.RLock()

    def add_node(self, node):
        with self.lock:
            hash_value = hash(node)
            bisect.insort(self.ring, (hash_value, node))

    def get_node(self, key):
        with self.lock:
            hash_value = hash(key)
            idx = bisect.bisect_left(self.ring, (hash_value,))
            if idx == len(self.ring):
                idx = 0
            return self.ring[idx][1]

Memory Optimization

Compressed Ring Representation

class CompressedRing:
    def __init__(self):
        self.hashes = []  # Sorted list of hash values
        self.nodes = []  # Parallel list of node IDs

    def add_node(self, node):
        hash_value = hash(node)
        idx = bisect.bisect_left(self.hashes, hash_value)
        self.hashes.insert(idx, hash_value)
        self.nodes.insert(idx, node)

    def get_node(self, key):
        hash_value = hash(key)
        idx = bisect.bisect_left(self.hashes, hash_value)
        if idx == len(self.hashes):
            idx = 0
        return self.nodes[idx]

Bitmap-Based Ring (for small hash spaces)

class BitmapRing:
    def __init__(self, size=256):
        self.size = size
        self.bitmap = [None] * size

    def add_node(self, node):
        hash_value = hash(node) % self.size
        self.bitmap[hash_value] = node

    def get_node(self, key):
        hash_value = hash(key) % self.size
        for i in range(self.size):
            idx = (hash_value + i) % self.size
            if self.bitmap[idx] is not None:
                return self.bitmap[idx]
        return None

🌐 6. Real-World Implementations

DynamoDB (Amazon)

Key Features:

Consistent hashing with virtual nodes
Replication across multiple availability zones
Tunable consistency levels
Automatic partitioning and rebalancing

Architecture:

Request → Load Balancer → Router Node → Partition → Replica

Partitioning:

Uses consistent hashing to partition data across nodes
Each partition is assigned a range of hash keys
Partitions are automatically split when they grow too large

Cassandra

Key Features:

Consistent hashing with virtual nodes (tokens)
Rack-aware replication
Configurable consistency levels
Gossip protocol for cluster membership

Token Assignment:

# Calculate token for a node
nodetool ring

Replication Strategy:

# SimpleStrategy (single datacenter)
CREATE KEYSPACE my_keyspace
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

# NetworkTopologyStrategy (multiple datacenters)
CREATE KEYSPACE my_keyspace
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3,
  'dc2': 2
};

Redis Cluster

Key Features:

Hash slots (16384 slots)
Consistent hashing via hash slots
Automatic failover
Master-slave replication

Slot Assignment:

# Assign slots to nodes
redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \
  127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
  --cluster-replicas 1

Hash Slot Calculation:

def get_slot(key):
    import hashlib
    # Extract hash tag if present (e.g., {user123})
    if '{' in key and '}' in key:
        key = key[key.index('{')+1:key.index('}')]
    return int(hashlib.crc16(key.encode())) & 16383

Memcached (Ketama)

Key Features:

Ketama consistent hashing algorithm
Virtual nodes for load balancing
Simple client-side implementation

Ketama Algorithm:

import hashlib

def ketama_hash(key):
    # MD5 hash of the key
    md5 = hashlib.md5(key.encode()).digest()

    # Use first 4 bytes as the hash value
    return int.from_bytes(md5[:4], byteorder='big')

Chord (Distributed Hash Table)

Key Features:

Consistent hashing for peer-to-peer networks
Finger tables for efficient lookups
Self-organizing and fault-tolerant

Finger Table:

Node n maintains finger table:
finger[i] = successor of (n + 2^i) mod 2^m

For m = 3 (8 nodes):
Node 0: [1, 2, 4]
Node 1: [2, 3, 5]
Node 2: [3, 4, 6]
...

🔗 7. Integration Patterns

Pattern 1: Client-Side Routing

The client determines which server to contact based on consistent hashing.

Pros:

No central coordinator needed
Low latency (direct client-server communication)
Scalable (clients can be numerous)

Cons:

Clients must implement consistent hashing
Harder to update routing logic
Potential for inconsistent client implementations

Example:

class ClientSideRouter:
    def __init__(self, servers):
        self.ring = ConsistentHash()
        for server in servers:
            self.ring.add_node(server)

    def get(self, key):
        server = self.ring.get_node(key)
        return send_request(server, "GET", key)

    def set(self, key, value):
        server = self.ring.get_node(key)
        return send_request(server, "SET", key, value)

Pattern 2: Proxy-Based Routing

A proxy layer handles consistent hashing and routes requests to appropriate servers.

Pros:

Centralized routing logic
Easier to update and maintain
Can implement additional features (caching, rate limiting)

Cons:

Single point of failure (if not highly available)
Additional network hop
Proxy must handle all traffic

Example:

class ProxyRouter:
    def __init__(self, backend_servers):
        self.ring = ConsistentHash()
        for server in backend_servers:
            self.ring.add_node(server)

    def handle_request(self, request):
        key = extract_key(request)
        server = self.ring.get_node(key)
        return forward_request(server, request)

Pattern 3: Service Discovery Integration

Combine consistent hashing with service discovery for dynamic server lists.

Example with Eureka:

class DynamicRing:
    def __init__(self, discovery_client):
        self.discovery_client = discovery_client
        self.ring = ConsistentHash()
        self.update_ring()

    def update_ring(self):
        services = self.discovery_client.get_services()
        self.ring = ConsistentHash()
        for service in services:
            for instance in service.instances:
                self.ring.add_node(instance.host)

    def get_node(self, key):
        return self.ring.get_node(key)

Pattern 4: Multi-Tenant Routing

Use consistent hashing to route requests for different tenants to different server pools.

Example:

class MultiTenantRouter:
    def __init__(self, tenant_configs):
        self.rings = {}
        for tenant, config in tenant_configs.items():
            self.rings[tenant] = ConsistentHash()
            for server in config['servers']:
                self.rings[tenant].add_node(server)

    def route(self, tenant, key):
        ring = self.rings.get(tenant)
        if ring:
            return ring.get_node(key)
        return None

Pattern 5: Hotspot Detection and Mitigation

Detect hotspots and redistribute load dynamically.

Example:

class HotspotAwareRing:
    def __init__(self):
        self.ring = ConsistentHash()
        self.load_stats = {}

    def get_node(self, key):
        node = self.ring.get_node(key)

        # Check if node is overloaded
        if self.is_overloaded(node):
            # Find alternative node
            alternative = self.find_alternative(key, node)
            if alternative:
                return alternative

        return node

    def is_overloaded(self, node):
        stats = self.load_stats.get(node, {})
        return stats.get('requests_per_second', 0) > 10000

⚖️ 8. Pros and Cons

Pros

✅ Minimal Data Movement: Adding or removing nodes only affects a small subset of keys, minimizing data migration.

✅ Scalability: Can handle thousands of nodes without significant performance degradation.

✅ Load Balancing: With virtual nodes, provides excellent load distribution across the cluster.

✅ Decentralization: No central coordinator needed, reducing single points of failure.

✅ Fault Tolerance: Natural support for replication and high availability.

✅ Flexibility: Can be adapted for various use cases (caching, databases, load balancing).

✅ Deterministic: Same key always routes to the same node (unless topology changes).

Cons

❌ Complexity: More complex to implement than simple modulo hashing.

❌ Memory Overhead: Maintaining the ring structure requires additional memory.

❌ Lookup Overhead: $O(\log N)$ lookup time vs. $O(1)$ for modulo hashing.

❌ Cold Start: Initial cluster setup requires careful planning of virtual node placement.

❌ Uneven Distribution: Without virtual nodes, can suffer from significant data skew.

❌ Network Partitions: During network partitions, different parts of the cluster may have different views of the ring.

❌ Rebalancing Complexity: Rebalancing after node failures requires careful coordination.

When to Use Consistent Hashing

Use Consistent Hashing when:

Building a distributed cache (Redis, Memcached)
Designing a partitioned database (Cassandra, DynamoDB)
Implementing a load balancer for stateful services
Building a distributed file system
Designing a message queue with partitioned topics

Consider alternatives when:

Building a simple, small-scale application
Using a managed service that handles partitioning automatically
Performance is critical and lookup overhead is unacceptable
The system has a fixed, small number of nodes

🎓 9. Interview Questions

Beginner Level

Q1: What is the problem with using modulo hashing for distributed systems? A: Modulo hashing causes massive data redistribution when the number of nodes changes. When adding or removing a node, almost all keys need to be remapped, causing cache misses, database overload, and potential system failure.

Q2: What is consistent hashing? A: Consistent hashing is a technique that maps data to nodes in a way that minimizes data movement when nodes are added or removed. It uses a circular hash ring where both nodes and data keys are placed, and data is assigned to the first node encountered when moving clockwise from the key's position.

Q3: How does consistent hashing solve the data redistribution problem? A: In consistent hashing, adding or removing a node only affects keys that hash to positions near that node. Most keys remain assigned to their original nodes, minimizing data movement and system disruption.

Q4: What is a hash ring? A: A hash ring is a circular data structure representing the hash space. Both server nodes and data keys are mapped to positions on this ring using a hash function. Data is assigned to the first server encountered when moving clockwise from the key's position.

Q5: What is the time complexity of looking up a node in consistent hashing? A: The time complexity is $O(\log N)$ where $N$ is the number of nodes, assuming the ring is implemented as a balanced binary search tree or sorted array with binary search.

Intermediate Level

Q6: What are virtual nodes and why are they needed? A: Virtual nodes are multiple representations of a single physical node on the hash ring. They're needed to ensure even load distribution, handle heterogeneous hardware (different server capacities), and prevent hotspots. Without virtual nodes, a small number of physical nodes would result in uneven data distribution.

Q7: How do you choose the number of virtual nodes per physical node? A: The number depends on the cluster size and desired load balancing. For small clusters (< 10 nodes), use 100-200 virtual nodes per physical node. For medium clusters (10-100 nodes), use 50-100. For large clusters (> 100 nodes), use 10-50. More virtual nodes provide better distribution but increase memory usage and lookup time.

Q8: How does replication work with consistent hashing? A: To replicate data, the system finds the primary node by walking clockwise from the key's position. It then continues walking to find the next $R-1$ distinct physical nodes, where $R$ is the replication factor. This ensures replicas are distributed across different physical machines.

Q9: What happens when a node fails in a consistent hashing system? A: When a node fails, its assigned data "falls forward" to the next node clockwise on the ring. The system detects the failure through health checks or gossip protocol, redirects traffic to the next available node, and uses read repair or anti-entropy to restore replication.

Q10: How do you implement weighted load balancing with consistent hashing? A: Assign different numbers of virtual nodes to each physical server based on their capacity. A more powerful server might have 500 virtual nodes while a smaller server has 100. This ensures that more powerful servers handle proportionally more traffic.

Advanced Level

Q11: Compare different hash functions for consistent hashing. A: MD5 and SHA-1 provide good distribution but are slower. MurmurHash and xxHash are very fast with excellent distribution, making them ideal for performance-critical applications. The choice depends on the specific requirements: speed vs. cryptographic security.

Q12: How do you handle network partitions in consistent hashing? A: During network partitions, different parts of the cluster may have different views of the ring. Use gossip protocols to eventually converge on a consistent view. Implement quorum-based operations to ensure consistency. Use version vectors or vector clocks to detect and resolve conflicts.

Q13: What is Jump Consistent Hash and how does it differ from traditional consistent hashing? A: Jump Consistent Hash is a newer algorithm that provides $O(1)$ lookup time without maintaining a ring structure. It uses a mathematical formula to determine the bucket for a given key. It's faster and uses less memory but doesn't support weighted nodes or complex replication strategies.

Q14: How do you implement hot spot detection and mitigation in consistent hashing? A: Monitor request rates and latency for each node. Detect hotspots when a node's load exceeds a threshold. Mitigate by temporarily redirecting traffic to less loaded nodes, adding more virtual nodes for overloaded servers, or implementing adaptive load balancing algorithms.

Q15: How do you integrate consistent hashing with service discovery? A: Combine consistent hashing with service discovery by dynamically updating the ring when services are registered or deregistered. Use a service discovery system like Eureka or Consul to get the current list of available services, then update the consistent hash ring accordingly. Implement periodic synchronization to handle network partitions.

Senior Level

Q16: Design a distributed cache system using consistent hashing. A: Implement client-side consistent hashing with virtual nodes for load balancing. Use a replication factor of 3 for high availability. Implement write-through and write-back caching strategies. Use gossip protocol for cluster membership. Implement cache invalidation and eviction policies. Monitor cache hit rates and adjust virtual node counts dynamically.

Q17: How do you handle data rebalancing during cluster scaling? A: Implement gradual data migration to avoid overwhelming the network. Use a two-phase approach: first add new nodes to the ring, then gradually migrate data. Implement throttling to control migration speed. Use checksums to verify data integrity. Support rollback in case of failures. Monitor system health during migration.

Q18: How do you implement multi-datacenter replication with consistent hashing? A: Use rack-aware or zone-aware replication strategies. Place replicas in different datacenters for disaster recovery. Implement cross-datacenter synchronization with conflict resolution. Use quorum-based consistency levels to balance latency and consistency. Implement data locality optimizations for read-heavy workloads.

Q19: How do you optimize consistent hashing for high-throughput systems? A: Use fast hash functions like MurmurHash or xxHash. Implement lock-free data structures for concurrent access. Use memory-mapped files for large rings. Implement batch operations for bulk requests. Use CPU cache-friendly data structures. Profile and optimize hot paths. Consider hardware acceleration (FPGA, GPU) for hash computation.

Q20: How do you debug and troubleshoot consistent hashing issues? A: Implement comprehensive logging and monitoring. Track key distribution and node load. Visualize the ring topology. Implement health checks and alerts. Use distributed tracing to follow request flows. Implement canary deployments for testing changes. Use chaos engineering to test failure scenarios. Document known issues and solutions.

🧠 10. Senior Deep Dive: Advanced Topics

Topic 1: Consistency Models

Eventual Consistency

In systems using consistent hashing, eventual consistency is often the default. Updates propagate asynchronously, and different nodes may temporarily have different views of the data.

Implementation:

class EventualConsistentStore:
    def __init__(self):
        self.data = {}
        self.vector_clocks = {}

    def write(self, key, value, node_id):
        if key not in self.vector_clocks:
            self.vector_clocks[key] = VectorClock()

        self.vector_clocks[key].increment(node_id)
        self.data[key] = (value, self.vector_clocks[key].copy())

    def read(self, key):
        if key in self.data:
            return self.data[key]
        return None

    def merge(self, remote_data):
        for key, (value, clock) in remote_data.items():
            if key not in self.vector_clocks:
                self.data[key] = (value, clock)
                self.vector_clocks[key] = clock
            elif clock > self.vector_clocks[key]:
                self.data[key] = (value, clock)
                self.vector_clocks[key] = clock

Strong Consistency

For strong consistency, implement quorum-based operations:

class StrongConsistentStore:
    def __init__(self, replication_factor=3):
        self.replication_factor = replication_factor
        self.ring = ConsistentHash()

    def write(self, key, value):
        replicas = self.get_replicas(key)
        acks = 0

        for replica in replicas:
            try:
                replica.write(key, value)
                acks += 1
                if acks >= self.quorum():
                    return True
            except Exception:
                continue

        return False

    def read(self, key):
        replicas = self.get_replicas(key)
        results = []

        for replica in replicas:
            try:
                result = replica.read(key)
                results.append(result)
                if len(results) >= self.quorum():
                    return self.resolve_conflicts(results)
            except Exception:
                continue

        return None

    def quorum(self):
        return (self.replication_factor // 2) + 1

Topic 2: Failure Detection

Gossip Protocol

Implement a gossip protocol for failure detection and cluster membership:

import random
import time

class GossipProtocol:
    def __init__(self, node_id, peers):
        self.node_id = node_id
        self.peers = peers
        self.heartbeat = time.time()
        self.failure_detector = FailureDetector()

    def gossip(self):
        # Select random peers to gossip with
        sample = random.sample(self.peers, min(3, len(self.peers)))

        for peer in sample:
            try:
                peer.receive_heartbeat(self.node_id, self.heartbeat)
                peer.receive_member_list(self.get_member_list())
            except Exception:
                self.failure_detector.record_failure(peer)

    def receive_heartbeat(self, node_id, timestamp):
        self.failure_detector.record_heartbeat(node_id, timestamp)

    def get_member_list(self):
        # Return list of live nodes
        return [p for p in self.peers if self.failure_detector.is_alive(p)]

Phi Accrual Failure Detector

Use the Phi Accrual Failure Detector for more accurate failure detection:

class PhiAccrualFailureDetector:
    def __init__(self, threshold=8):
        self.threshold = threshold
       .heartbeat_history = {}

    def record_heartbeat(self, node_id, timestamp):
        if node_id not in self.heartbeat_history:
            self.heartbeat_history[node_id] = []

        self.heartbeat_history[node_id].append(timestamp)
        # Keep only recent heartbeats
        if len(self.heartbeat_history[node_id]) > 1000:
            self.heartbeat_history[node_id] = self.heartbeat_history[node_id][-1000:]

    def is_alive(self, node_id):
        if node_id not in self.heartbeat_history:
            return False

        history = self.heartbeat_history[node_id]
        if len(history) < 2:
            return True

        # Calculate inter-arrival times
        intervals = [history[i] - history[i-1] for i in range(1, len(history))]

        # Calculate mean and standard deviation
        mean = sum(intervals) / len(intervals)
        std = (sum((x - mean) ** 2 for x in intervals) / len(intervals)) ** 0.5

        # Calculate phi
        if std == 0:
            return True

        last_heartbeat = history[-1]
        time_since_last = time.time() - last_heartbeat
        phi = (time_since_last - mean) / std

        return phi < self.threshold

Topic 3: Data Migration Strategies

Online Migration

Implement online migration without downtime:

class OnlineMigrator:
    def __init__(self, old_ring, new_ring):
        self.old_ring = old_ring
        self.new_ring = new_ring
        self.migration_progress = {}

    def migrate_key(self, key):
        old_node = self.old_ring.get_node(key)
        new_node = self.new_ring.get_node(key)

        if old_node != new_node:
            # Migrate data from old to new node
            data = old_node.read(key)
            new_node.write(key, data)
            self.migration_progress[key] = new_node

            # Optionally delete from old node
            # old_node.delete(key)

    def get_node(self, key):
        # Check if key has been migrated
        if key in self.migration_progress:
            return self.migration_progress[key]

        # Use old ring for unmigrated keys
        return self.old_ring.get_node(key)

    def get_migration_status(self):
        total_keys = len(self.migration_progress)
        estimated_total = estimate_total_keys()
        return total_keys / estimated_total

Phased Migration

Implement phased migration for large datasets:

class PhasedMigrator:
    def __init__(self, phases):
        self.phases = phases
        self.current_phase = 0

    def migrate(self):
        for phase in self.phases:
            self.current_phase += 1
            self.execute_phase(phase)

            # Verify phase completion
            if not self.verify_phase(phase):
                raise MigrationError(f"Phase {self.current_phase} failed")

            # Wait before next phase
            time.sleep(phase.wait_time)

    def execute_phase(self, phase):
        for key_range in phase.key_ranges:
            for key in get_keys_in_range(key_range):
                self.migrate_key(key)

    def verify_phase(self, phase):
        # Verify data integrity
        for key_range in phase.key_ranges:
            for key in get_keys_in_range(key_range):
                if not self.verify_key(key):
                    return False
        return True

Topic 4: Performance Optimization

Lock-Free Implementation

Implement lock-free consistent hashing:

import java.util.concurrent.atomic.AtomicReference;

public class LockFreeRing {
    private final AtomicReference<TreeMap<Long, String>> ringRef =
        new AtomicReference<>(new TreeMap<>());

    public void addNode(String node) {
        while (true) {
            TreeMap<Long, String> oldRing = ringRef.get();
            TreeMap<Long, String> newRing = new TreeMap<>(oldRing);

            long hash = hash(node);
            newRing.put(hash, node);

            if (ringRef.compareAndSet(oldRing, newRing)) {
                return;
            }
        }
    }

    public String getNode(String key) {
        TreeMap<Long, String> ring = ringRef.get();
        long hash = hash(key);

        Map.Entry<Long, String> entry = ring.ceilingEntry(hash);
        if (entry == null) {
            entry = ring.firstEntry();
        }

        return entry.getValue();
    }
}

SIMD Optimization

Use SIMD instructions for hash computation:

#include <immintrin.h>

void simd_hash(const uint8_t* data, size_t len, uint64_t* hash) {
    __m256i accumulator = _mm256_setzero_si256();

    // Process 32 bytes at a time
    for (size_t i = 0; i + 32 <= len; i += 32) {
        __m256i chunk = _mm256_loadu_si256((__m256i*)(data + i));
        accumulator = _mm256_xor_si256(accumulator, chunk);
    }

    // Reduce to single hash value
    uint64_t result[4];
    _mm256_storeu_si256((__m256i*)result, accumulator);

    *hash = result[0] ^ result[1] ^ result[2] ^ result[3];
}

Topic 5: Security Considerations

Hash Collision Attacks

Protect against hash collision attacks:

import hashlib
import hmac

class SecureHashRing:
    def __init__(self, secret_key):
        self.secret_key = secret_key

    def hash(self, key):
        # Use HMAC to prevent hash collision attacks
        return int(hashlib.sha256(
            hmac.new(self.secret_key, key.encode(), hashlib.sha256).digest()
        ).hexdigest(), 16)

    def get_node(self, key):
        # Add random salt to prevent timing attacks
        salt = os.urandom(16)
        hash_value = self.hash(key + salt.hex())
        return self.ring.get_node(hash_value)

Access Control

Implement access control for ring operations:

class SecureRing:
    def __init__(self):
        self.ring = ConsistentHash()
        self.acl = AccessControlList()

    def add_node(self, node, user):
        if not self.acl.check_permission(user, "ring:add_node"):
            raise PermissionError("User not authorized to add nodes")

        self.ring.add_node(node)

    def remove_node(self, node, user):
        if not self.acl.check_permission(user, "ring:remove_node"):
            raise PermissionError("User not authorized to remove nodes")

        self.ring.remove_node(node)

📖 Additional Resources

Academic Papers

"Consistent Hashing and Random Trees" by Karger et al. (1997)
"Dynamo: Amazon's Highly Available Key-value Store" by DeCandia et al. (2007)
"The Chord Distributed Lookup System" by Stoica et al. (2001)

Books

"Designing Data-Intensive Applications" by Martin Kleppmann
"Distributed Systems: Principles and Paradigms" by Tanenbaum and Van Steen
"System Design Interview" by Alex Xu

Open Source Projects

Hashicorp Consul - Service discovery with consistent hashing
Apache Cassandra - Distributed database with consistent hashing
Redis Cluster - Distributed Redis with hash slots

Online Resources

Consistent Hashing Explained
System Design Primer
The Morning Paper - Summaries of CS papers

🎯 Best Practices

Start Simple: Begin with basic consistent hashing, add complexity as needed
Monitor Everything: Track key distribution, node load, and system health
Test Thoroughly: Use chaos engineering to test failure scenarios
Document Decisions: Maintain architecture decision records (ADRs)
Plan for Failure: Design for node failures, network partitions, and data corruption
Use Virtual Nodes: Always use virtual nodes for better load distribution
Implement Backpressure: Protect systems from overload during failures
Gradual Rollouts: Use canary deployments for major changes
Automate Operations: Automate node addition, removal, and rebalancing
Security First: Implement authentication, authorization, and encryption

📚 Table of Contents​

🚧 1. The Bottleneck: Naive Modulo Hashing​

The Naive Approach​

Code Example​

The Mass Redistribution Cascade​

Visual Example​

Impact on Production Systems​

✅ 2. The Solution: Consistent Hashing​

2.1 The Hash Ring Mechanics​

Concept Overview​

Visual Representation​

2.2 Time Complexity and Implementation​

Implementation Options​

Performance Characteristics​

2.3 Graceful Scaling​

Adding a Node​

Removing a Node​

2.4 Mathematical Properties​

Consistency Property​

Load Distribution​

⚖️ 3. Resolving Data Skew: Virtual Nodes​

The Problem with Physical Nodes Only​

The Virtual Nodes (V-Nodes) Pattern​

How Virtual Nodes Work​

Benefits of Virtual Nodes​

Choosing the Number of Virtual Nodes​

Implementation Example​

🔄 4. High Availability: Replication on the Ring​

Replication Strategy​

Visual Example​

Replication Strategies​

Strategy 1: Clockwise Replication​

Strategy 2: Rack-Aware Replication​

Strategy 3: Snitch-Based Replication​

Consistency Levels​

Handling Node Failures​

⚙️ 5. How Consistent Hashing Works Internally​

Hash Function Selection​

Common Hash Functions​

Ring Data Structure​

Option 1: Sorted Array with Binary Search​

Option 2: TreeMap (Red-Black Tree)​

Option 3: Jump Consistent Hash​

Concurrency Considerations​

Thread-Safe Implementation​

Read-Write Lock Pattern​

Memory Optimization​

Compressed Ring Representation​

Bitmap-Based Ring (for small hash spaces)​

🌐 6. Real-World Implementations​

DynamoDB (Amazon)​

Cassandra​

Redis Cluster​

Memcached (Ketama)​

Chord (Distributed Hash Table)​

🔗 7. Integration Patterns​

Pattern 1: Client-Side Routing​

Pattern 2: Proxy-Based Routing​

Pattern 3: Service Discovery Integration​

Pattern 4: Multi-Tenant Routing​

Pattern 5: Hotspot Detection and Mitigation​

⚖️ 8. Pros and Cons​

Pros​

Cons​

When to Use Consistent Hashing​

🎓 9. Interview Questions​

Beginner Level​

Intermediate Level​

Advanced Level​

Senior Level​

🧠 10. Senior Deep Dive: Advanced Topics​

Topic 1: Consistency Models​

Eventual Consistency​

Strong Consistency​

Topic 2: Failure Detection​

Gossip Protocol​

Phi Accrual Failure Detector​

Topic 3: Data Migration Strategies​

Online Migration​

Phased Migration​

📚 Table of Contents

🚧 1. The Bottleneck: Naive Modulo Hashing

The Naive Approach

Code Example

The Mass Redistribution Cascade

Visual Example

Impact on Production Systems

✅ 2. The Solution: Consistent Hashing

2.1 The Hash Ring Mechanics

Concept Overview

Visual Representation

2.2 Time Complexity and Implementation

Implementation Options

Performance Characteristics

2.3 Graceful Scaling

Adding a Node

Removing a Node

2.4 Mathematical Properties

Consistency Property

Load Distribution

⚖️ 3. Resolving Data Skew: Virtual Nodes

The Problem with Physical Nodes Only

The Virtual Nodes (V-Nodes) Pattern

How Virtual Nodes Work

Benefits of Virtual Nodes

Choosing the Number of Virtual Nodes

Implementation Example

🔄 4. High Availability: Replication on the Ring

Replication Strategy

Visual Example

Replication Strategies

Strategy 1: Clockwise Replication

Strategy 2: Rack-Aware Replication

Strategy 3: Snitch-Based Replication

Consistency Levels

Handling Node Failures

⚙️ 5. How Consistent Hashing Works Internally

Hash Function Selection

Common Hash Functions

Ring Data Structure

Option 1: Sorted Array with Binary Search

Option 2: TreeMap (Red-Black Tree)

Option 3: Jump Consistent Hash

Concurrency Considerations

Thread-Safe Implementation

Read-Write Lock Pattern

Memory Optimization

Compressed Ring Representation

Bitmap-Based Ring (for small hash spaces)

🌐 6. Real-World Implementations

DynamoDB (Amazon)

Cassandra

Redis Cluster

Memcached (Ketama)

Chord (Distributed Hash Table)

🔗 7. Integration Patterns

Pattern 1: Client-Side Routing

Pattern 2: Proxy-Based Routing

Pattern 3: Service Discovery Integration

Pattern 4: Multi-Tenant Routing

Pattern 5: Hotspot Detection and Mitigation

⚖️ 8. Pros and Cons

Pros

Cons

When to Use Consistent Hashing

🎓 9. Interview Questions

Beginner Level

Intermediate Level

Advanced Level

Senior Level

🧠 10. Senior Deep Dive: Advanced Topics

Topic 1: Consistency Models

Eventual Consistency

Strong Consistency

Topic 2: Failure Detection

Gossip Protocol

Phi Accrual Failure Detector

Topic 3: Data Migration Strategies

Online Migration

Phased Migration