
Consistent Hashing: The 'Locker Room' Mental Model

How does Cassandra know which server stores your data? A mastery guide to consistent hashing, virtual nodes, and why adding a server doesn't have to wipe out your cache.

You have 3 cache servers. You use server = hash(key) % 3 to decide which server stores each key.

You add a 4th server. Now hash(key) % 4 gives different results. Every cached key maps to a new server. Your cache hit rate drops from 90% to near 0%. Your database collapses.
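The scale of the damage is easy to quantify. A quick sketch (key names are arbitrary; md5 is used only because Python's built-in hash() is salted per process):

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # Stable hash of the key, reduced modulo the server count
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 3) != bucket(k, 4))
print(f"{moved / len(keys):.0%} of keys map to a different server")  # typically ~75%
```

For a uniform hash, a key keeps its server only when h % 3 == h % 4, which happens for 3 of every 12 residues — so roughly 75% of keys move.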

This is why simple modular hashing breaks when you scale. Consistent Hashing is the solution used by Redis Cluster, Cassandra, DynamoDB, and every CDN.


Part 1: Foundations (The Mental Model)

Simple Hashing = The School Locker Assignment

Imagine 1,000 students and 10 lockers. locker = student_id % 10.

Simple. Uniform. Works perfectly.

Problem: A new locker (server) is added. Now student_id % 11. Almost every student gets a different locker. Everyone is locked out of their old locker. Mass confusion.

Consistent Hashing = The Clock/Ring

Instead of % servers, imagine a giant clock ring running from 0 to 360 degrees (in practice, 0 to 2^32 − 1).

  1. Place servers on the ring: Hash each server’s name → position on the ring.
  2. Place keys on the ring: Hash each key → position on the ring.
  3. Find the server: Walk clockwise from the key. The first server you hit owns that key.
        0° (Server A)
         |
270° -- Ring -- 90° (Server B)
         |
       180° (Server C)

Key "user:123" hashes to 200° → walk clockwise → first server = Server A (at 0°/360°, wrapping around the ring)

When a server is added/removed, only the keys between it and its predecessor on the ring are affected. Everything else stays put.

Old way: add a server → ~100% of keys remapped. Consistent hashing: add a server → only ~1/N of keys remapped (N = number of servers after the change).


Part 2: The Investigation (Virtual Nodes)

A naive ring has one problem: uneven distribution. By chance, one server might cover 1° and another might cover 180°.

Virtual Nodes solve this. Instead of each server having 1 position, it has 100+ positions on the ring (with different hash seeds).

Physical Servers: A, B, C
Virtual Nodes:
  Ring: A₁ B₂ C₃ A₄ B₅ C₆ A₇ B₈ C₉ A₁₀ ...
        (100 virtual nodes each, evenly distributed)

Each key still walks clockwise to find its server. But now each physical server handles a well-distributed portion of the ring — even if servers have different capacities (a bigger server gets more virtual nodes).
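The effect is measurable. This sketch compares per-server load with 1 virtual node versus 150 (server names and key names are illustrative):

```python
import hashlib
from bisect import bisect_right
from collections import Counter

def md5_int(s: str) -> int:
    # Stable integer hash for ring positions and keys
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

keys = [f"key:{i}" for i in range(20_000)]
for vnodes in (1, 150):
    # One ring entry per virtual node: position -> physical server
    ring = {md5_int(f"{s}:{i}"): s for s in ("A", "B", "C") for i in range(vnodes)}
    positions = sorted(ring)
    load = Counter(
        ring[positions[bisect_right(positions, md5_int(k)) % len(positions)]]
        for k in keys
    )
    print(f"{vnodes:>3} vnode(s) per server: {dict(sorted(load.items()))}")
```

With a single position per server the split can be wildly lopsided; with 150 virtual nodes each server ends up owning close to a third of the keys.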


Part 3: The Diagnosis (Common Problems)

Problem | Cause | Fix
Uneven load | Few servers, unlucky ring positions | Add virtual nodes (100-150 per server)
Hot spots | One key gets 90% of traffic | Shard the key: user:123:shard_{0-9}
Server fails, keys lost | No replication | Replicate to the next N servers clockwise on the ring
Stale reads after topology change | Client still has old ring state | Use a gossip protocol (Cassandra) or centralized config (ZooKeeper)
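The replication fix above can be sketched by walking clockwise past the owner and collecting the next distinct physical servers. A minimal sketch (class and method names are my own, not from any particular library):

```python
import hashlib
from bisect import bisect_right

def md5_int(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ReplicatedRing:
    def __init__(self, servers, vnodes=150):
        # position -> physical server, one entry per virtual node
        self.ring = {md5_int(f"{s}:{i}"): s for s in servers for i in range(vnodes)}
        self.positions = sorted(self.ring)

    def get_servers(self, key, n=3):
        """Return the owner plus the next n-1 distinct physical servers clockwise."""
        idx = bisect_right(self.positions, md5_int(key))
        found = []
        for step in range(len(self.positions)):
            server = self.ring[self.positions[(idx + step) % len(self.positions)]]
            if server not in found:
                found.append(server)
                if len(found) == n:
                    break
        return found

ring = ReplicatedRing(["a", "b", "c", "d", "e"])
print(ring.get_servers("user:123", n=3))  # owner first, then two clockwise replicas
```

Skipping duplicate physical servers matters: adjacent virtual nodes often belong to the same machine, and replicating to the same box twice buys nothing.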

Part 4: The Resolution (Where Consistent Hashing Is Used)

1. Redis Cluster

Redis Cluster uses hash slots (0–16383) — a simplified consistent hashing variant.

redis-cli cluster info
redis-cli cluster nodes  # See which server owns which hash slots
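Clients compute the slot themselves as HASH_SLOT = CRC16(key) mod 16384, where CRC16 is the XMODEM variant, and a non-empty {...} hash tag restricts hashing to that substring. A from-scratch sketch (function names are my own):

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC16/XMODEM: polynomial 0x1021, initial value 0x0000
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def redis_slot(key: str) -> int:
    # Hash-tag rule: if the key contains a non-empty {...},
    # only that substring is hashed, so related keys share a slot
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

print(redis_slot("user:123"))
print(redis_slot("{user:123}.profile"))  # same slot as "user:123"
```

The hash tag is how Redis Cluster supports multi-key operations: keys that must live on the same node are given the same {...} tag.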

2. Cassandra

Uses a full consistent hash ring with virtual nodes (“vnodes”). Each node’s token range determines which rows it stores.

3. Your Own Load Balancer (Python)

import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, servers: list, replicas: int = 150):
        self.ring = {}
        self.sorted_keys = []
        
        for server in servers:
            for i in range(replicas):
                # Create 150 virtual nodes per server
                virtual_key = f"{server}:{i}"
                h = int(hashlib.md5(virtual_key.encode()).hexdigest(), 16)
                self.ring[h] = server
                self.sorted_keys.append(h)
        
        self.sorted_keys.sort()
    
    def get_server(self, key: str) -> str:
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        idx = bisect_right(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]


# Usage
ring = ConsistentHashRing(["server-a", "server-b", "server-c"])
print(ring.get_server("user:123"))  # always the same server for a given key
print(ring.get_server("user:456"))  # keys spread across all three servers

# Add a new server
ring2 = ConsistentHashRing(["server-a", "server-b", "server-c", "server-d"])
# Only ~25% of keys change their server!
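That ~25% figure can be checked empirically. This standalone sketch rebuilds a minimal ring rather than reusing the class above, so it runs on its own (helper names are my own):

```python
import hashlib
from bisect import bisect_right

def md5_int(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build(servers, vnodes=150):
    # position -> server, one entry per virtual node
    ring = {md5_int(f"{s}:{i}"): s for s in servers for i in range(vnodes)}
    return sorted(ring), ring

def owner(positions, ring, key):
    return ring[positions[bisect_right(positions, md5_int(key)) % len(positions)]]

keys = [f"user:{i}" for i in range(10_000)]
old = build(["server-a", "server-b", "server-c"])
new = build(["server-a", "server-b", "server-c", "server-d"])
moved = sum(1 for k in keys if owner(*old, k) != owner(*new, k))
print(f"{moved / len(keys):.0%} of keys moved to a different server")
```

Every key that moves is one that server-d now owns; the existing servers' virtual nodes never change position, so nothing shuffles between a, b, and c.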

Final Mental Model

Simple Hashing (% N)    -> School lockers. Add one locker: everyone loses their stuff.
Consistent Hashing      -> Clock Ring. Add a server: only nearby keys are affected.
Virtual Nodes           -> Each server on the ring 150 times. Evenly distributed load.

hash(key) % N           -> 100% remapped when N changes.
Consistent Hashing      -> ~1/N remapped when N changes. (Critical for large caches).

When you need it:

  • Distributed caches (Redis Cluster, Memcached).
  • Distributed databases (Cassandra, DynamoDB).
  • Load balancing stateful connections (sticky sessions without a lookup table).