System Design
What is consistent hashing and why is it useful?
Consistent hashing distributes keys across servers in a way that minimizes remapping when servers are added or removed. It is useful for caches, sharding, load balancing, and routing users to stable backend nodes.
The Short Answer
Consistent hashing is a way to assign keys to servers so that adding or removing servers causes minimal remapping.
It is commonly used in distributed caches, sharded databases, load balancing, and systems where requests for the same key should usually go to the same backend node.
The Real Problem It Solves
Suppose you have 3 user profile servers and you route users with simple modulo hashing:
serverIndex = hash(userId) % numberOfServers;This works until the number of servers changes.
If you go from 3 servers to 4 servers, the formula changes from:
hash(userId) % 3
to
hash(userId) % 4That tiny change can remap a huge percentage of users to different servers.
Real Example: DSP User Profile Routing
In a DSP or ad-serving system, user profile data is often hot and latency-sensitive. A bid request arrives with a user identifier, and the system needs to fetch or update that user's profile quickly.
A useful routing goal is:
That improves cache locality. If user 123 usually routes to Profile Server B, then Server B is more likely to already have that user's segments, frequency caps, recent activity, or profile state warmed in memory.
Good Locality
Bad Locality
Consistent hashing helps maintain this stable routing while still allowing the cluster to scale up, scale down, or handle node failure.
The Mental Model: Hash Ring
Imagine a circle. Both servers and keys are hashed onto positions on that circle.
To find the server for a key, move clockwise on the ring until you hit the next server.
hash(userId)
↓
position on ring
↓
walk clockwise
↓
first server found handles that userWhat Happens When a Server Is Added?
With modulo hashing, adding a server changes the modulo number and many keys may move.
With consistent hashing, only keys that fall near the new server's location on the ring usually move to that new server.
Modulo Hashing
Consistent Hashing
What Happens When a Server Fails?
If a server fails, only keys that were assigned to that failed server need to move. They typically move clockwise to the next available server.
This limits the blast radius of a failure.
Why Virtual Nodes Matter
A basic hash ring can be uneven. One physical server may accidentally own a much larger portion of the ring than another.
To improve balance, systems often use virtual nodes. Instead of placing each physical server once on the ring, place it many times.
Without Virtual Nodes
With Virtual Nodes
Physical server:
ProfileServer-B
Virtual nodes:
ProfileServer-B#1
ProfileServer-B#2
ProfileServer-B#3
ProfileServer-B#4Where Consistent Hashing Is Useful
- Distributed caches
- Database sharding
- CDNs and edge routing
- DSP user profile routing
- Session affinity
- Distributed key-value stores
- Load balancing where key locality matters
When Not to Use It
Consistent hashing is not always necessary. If any server can handle any request equally well, normal load balancing may be simpler.
It becomes useful when the key-to-server relationship matters: cached data, user affinity, shard ownership, or minimizing movement during cluster changes.
The Interview-Friendly Explanation
Common Interview Follow-Ups
Why is hash(key) % N a problem?
Because when N changes, many keys can map to different servers. That causes cache misses, data movement, and traffic instability.
Does consistent hashing mean no keys move?
No. Some keys still move when servers are added or removed. The benefit is that far fewer keys move compared with modulo hashing.
What are virtual nodes?
Virtual nodes are multiple positions on the hash ring for the same physical server. They help distribute load more evenly and reduce hot spots.
Where would you use consistent hashing in adtech?
One use case is routing all requests for a particular user ID to the same user profile server so the profile data stays warm and lookups remain low-latency.
Is consistent hashing a replacement for replication?
No. Consistent hashing decides ownership or routing. Replication is still needed when you want redundancy, failover, or higher availability.