You have 3 cache servers. You use server = hash(key) % 3 to decide which server stores each key.
You add a 4th server. Now hash(key) % 4 gives a different result for roughly three-quarters of your keys. Most cached entries suddenly live on the wrong server, your cache hit rate collapses, and the flood of misses hammers your database.
This is why simple modular hashing breaks when you scale. Consistent Hashing is the fix, used by Redis Cluster, Cassandra, DynamoDB, and most CDNs.
Part 1: Foundations (The Mental Model)
Simple Hashing = The School Locker Assignment
Imagine 1,000 students and 10 lockers. locker = student_id % 10.
Simple. Uniform. Works perfectly.
Problem: A new locker (server) is added. Now student_id % 11. Almost every student gets a different locker. Everyone is locked out of their old locker. Mass confusion.
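You can verify the damage empirically. The sketch below (a hedged illustration, using MD5 only because it is a stable, well-spread hash; the `bucket` helper is an invented name) counts how many keys change server when you go from mod-3 to mod-4:

```python
# Demonstrates how modular hashing remaps most keys when a server is added.
# MD5 is used here only as a stable, uniform hash for the demo.
import hashlib

def bucket(key: str, n_servers: int) -> int:
    """Map a key to a server index with simple modular hashing."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n_servers

keys = [f"user:{i}" for i in range(10_000)]
before = {k: bucket(k, 3) for k in keys}
after = {k: bucket(k, 4) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 75% move
```

A key stays put only when its hash gives the same residue mod 3 and mod 4, which happens for 3 residues out of every 12 — so about 75% of keys move.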
Consistent Hashing = The Clock/Ring
Instead of hashing modulo the server count, imagine a giant clock ring from 0 to 360 degrees (or, as real implementations do, 0 to 2^32 − 1).
- Place servers on the ring: Hash each server’s name → position on the ring.
- Place keys on the ring: Hash each key → position on the ring.
- Find the server: Walk clockwise from the key. The first server you hit owns that key.
When a server is added/removed, only the keys between it and its predecessor on the ring are affected. Everything else stays put.
Old way: Add server → ~100% of keys remapped. Consistent Hashing: Add server → ~1/N keys remapped (where N = number of servers).
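The three steps above can be sketched in a few lines. This is a minimal ring with one position per server (no virtual nodes yet); `ring_hash`, `build_ring`, and `lookup` are illustrative names, not a standard API:

```python
# Minimal consistent-hash ring: servers and keys share one hash space,
# and a key belongs to the first server clockwise from its position.
import bisect
import hashlib

def ring_hash(s: str) -> int:
    """Position on the ring, derived from MD5 (stable and well spread)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(servers):
    """Sorted (position, server) points on the ring."""
    return sorted((ring_hash(s), s) for s in servers)

def lookup(ring, key: str) -> str:
    """Walk clockwise: first server position >= the key's position."""
    points = [p for p, _ in ring]
    i = bisect.bisect_left(points, ring_hash(key)) % len(ring)
    return ring[i][1]

keys = [f"user:{i}" for i in range(10_000)]
ring3 = build_ring(["A", "B", "C"])
ring4 = build_ring(["A", "B", "C", "D"])
moved = sum(1 for k in keys if lookup(ring3, k) != lookup(ring4, k))
print(f"{moved / len(keys):.1%} of keys moved")
```

The only keys that move are the ones the new server D captures — everything outside D's arc keeps its old owner. With a single position per server the size of that arc is down to luck, which is exactly the problem Part 2 tackles.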
Part 2: The Investigation (Virtual Nodes)
A naive ring has one problem: uneven distribution. By chance, one server might cover 1° and another might cover 180°.
Virtual Nodes solve this. Instead of occupying 1 position, each server occupies 100+ positions on the ring, produced by hashing variants of its name (e.g., server-A#0, server-A#1, and so on).
Each key still walks clockwise to find its server. But now each physical server handles a well-distributed portion of the ring — even if servers have different capacities (a bigger server gets more virtual nodes).
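A sketch of the idea, assuming MD5 for placement and 100 virtual nodes per server (the figure from the text, not a universal constant); the helper names are illustrative:

```python
# Virtual nodes: each physical server is hashed to many ring positions
# by appending a replica index, which evens out the arc sizes.
import bisect
import hashlib
from collections import Counter

def ring_hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(servers, vnodes=100):
    """Each server contributes `vnodes` points to the ring."""
    ring = []
    for s in servers:
        for i in range(vnodes):
            ring.append((ring_hash(f"{s}#vnode{i}"), s))
    return sorted(ring)

def lookup(ring, key: str) -> str:
    points = [p for p, _ in ring]
    i = bisect.bisect_left(points, ring_hash(key)) % len(ring)
    return ring[i][1]

keys = [f"user:{i}" for i in range(30_000)]
ring = build_ring(["A", "B", "C", "D"])
load = Counter(lookup(ring, k) for k in keys)
print(load)  # loads should be near-equal, roughly 25% each
```

Capacity weighting falls out for free: give a bigger server 200 virtual nodes instead of 100 and it owns roughly twice the ring.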
Part 3: The Diagnosis (Common Problems)
| Problem | Cause | Fix |
|---|---|---|
| Uneven load | Too few ring positions per server; unlucky hash placement | Add virtual nodes (100-150 per server) |
| Hot spots | One key gets 90% of traffic | Shard the key: user:123:shard_{0-9} |
| Server fails, keys lost | No replication | Replicate to N servers clockwise on the ring |
| Stale reads after topology change | Client still has old ring state | Use gossip protocol (Cassandra) or centralized config (Zookeeper) |
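The hot-spot fix from the table can be sketched as follows. This is a hedged illustration (the `shard_key` and `write_hot` helpers are invented names, and the dict stands in for a real cache client): writes fan out to all sub-keys, reads pick one at random, so one hot key's traffic spreads across up to 10 servers.

```python
# Hot-key sharding: split one hot key into n sub-keys so no single
# cache server absorbs all of its read traffic.
import random

def shard_key(key: str, n_shards: int = 10) -> str:
    """Pick one of n_shards sub-keys at random for each read."""
    return f"{key}:shard_{random.randrange(n_shards)}"

def write_hot(cache: dict, key: str, value, n_shards: int = 10) -> None:
    """Writes must go to every shard so any read shard is fresh."""
    for i in range(n_shards):
        cache[f"{key}:shard_{i}"] = value

cache = {}
write_hot(cache, "user:123", {"name": "Ada"})
print(shard_key("user:123"))  # e.g. user:123:shard_7
```

The trade-off is a 10x write amplification for that key — acceptable for read-heavy hot spots, wasteful otherwise.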
Part 4: The Resolution (Where Consistent Hashing Is Used)
1. Redis Cluster
Redis Cluster uses hash slots (0–16383), a fixed-slot variant of consistent hashing: every key maps to a slot via CRC16(key) % 16384, and slots are assigned to nodes, so rebalancing moves whole slots instead of rehashing keys.
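The slot computation is small enough to write out. Redis Cluster uses CRC16-CCITT (XMODEM); the sketch below reimplements it for illustration (note that real Redis additionally honors `{hash tags}`, hashing only the tagged part of the key, which this sketch omits):

```python
# Compute a Redis Cluster hash slot: CRC16 (XMODEM variant) mod 16384.
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT/XMODEM: poly 0x1021, init 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

def redis_slot(key: str) -> int:
    """Slot number in 0..16383 for a key (hash tags not handled)."""
    return crc16_xmodem(key.encode()) % 16384

print(redis_slot("user:123"))
```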
2. Cassandra
Uses a full consistent hash ring with virtual nodes (“vnodes”). Each node’s token range determines which rows it stores.
3. Your Own Load Balancer (Python)
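A minimal sketch of such a load balancer, combining the ring, virtual nodes, clockwise lookup, and add/remove. It assumes MD5 for ring placement and 100 virtual nodes per server; the `ConsistentHashRing` name and API are illustrative, not a known library:

```python
# A small consistent-hash load balancer: ring with virtual nodes,
# clockwise lookup, and incremental add/remove of servers.
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owners = []   # server owning each position (parallel list)
        for s in servers:
            self.add(s)

    def _hash(self, s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add(self, server: str) -> None:
        """Insert this server's virtual nodes into the sorted ring."""
        for i in range(self.vnodes):
            p = self._hash(f"{server}#vnode{i}")
            j = bisect.bisect_left(self._points, p)
            self._points.insert(j, p)
            self._owners.insert(j, server)

    def remove(self, server: str) -> None:
        """Drop all of this server's virtual nodes from the ring."""
        kept = [(p, o) for p, o in zip(self._points, self._owners) if o != server]
        self._points = [p for p, _ in kept]
        self._owners = [o for _, o in kept]

    def lookup(self, key: str) -> str:
        """First server position clockwise at or after the key's hash."""
        i = bisect.bisect_left(self._points, self._hash(key)) % len(self._points)
        return self._owners[i]

ring = ConsistentHashRing(["A", "B", "C"])
print(ring.lookup("user:123"))
```

Parallel sorted lists keep the sketch dependency-free; a production version would typically use a balanced tree or a sorted-containers structure so `add`/`remove` aren't O(n) per virtual node.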
Final Mental Model
Servers and keys live on the same circle. A key belongs to the first server clockwise from it. Adding or removing a server disturbs only the arc between it and its predecessor — everything else stays put.
When you need it:
- Distributed caches (Redis Cluster, Memcached).
- Distributed databases (Cassandra, DynamoDB).
- Load balancing stateful connections (sticky sessions without a lookup table).
