How Consistent Hashing Is Used by Load Balancers to Distribute Requests
Load balancers and consistent hashing in six minutes
Vertical vs. Horizontal Scaling
In a monolithic architecture, clients typically make requests to one single server. As the number of requests starts to scale, the single server does not have sufficient capacity to serve all the incoming requests.
Vertical scaling, where more CPU or RAM is added to the server, could be an option. It works only until the hardware limits of a single machine are reached.
In most cases, horizontal scaling, in which more servers are added, is the more scalable alternative.

Redirecting Requests With a Load Balancer
When we scale horizontally, requests are sent to a load balancer instead of directly to the servers.
The load balancer’s job is exactly what its name describes: to balance the load on the servers by distributing requests as uniformly as possible.
Hash function and modulo (%)
Every incoming request has a unique identifier (e.g. the client's IP address), and we assume these identifiers are uniformly random.
Using a hash function, we are able to obtain an output value, after which we apply the modulo function to get the number that corresponds to the server that the load balancer should be directing the request to.
- hash(ipAddress) → output
- output % number of servers → server ID
It is important to use a good hash function so that the output values are spread evenly across a wide range, which keeps the distribution close to uniform. The modulo operation then guarantees that the server ID is in the range 0 to (number of servers - 1).
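The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer is implemented: the names `pick_server` and `count_assignments` are hypothetical, and SHA-256 is only one assumed choice of a well-spread hash function. Python's built-in `hash()` is avoided here because its output for strings changes between processes by default.

```python
import hashlib
from collections import Counter


def pick_server(ip_address: str, num_servers: int) -> int:
    """Map a request identifier to a server ID in 0..num_servers - 1."""
    if num_servers <= 0:
        raise ValueError("num_servers must be positive")
    # Step 1: hash(ipAddress) -> output.
    # SHA-256 is an assumption; any hash that spreads its outputs evenly would do.
    digest = hashlib.sha256(ip_address.encode("utf-8")).digest()
    output = int.from_bytes(digest[:8], "big")
    # Step 2: output % number of servers -> server ID.
    return output % num_servers


def count_assignments(ip_addresses, num_servers: int) -> Counter:
    """Count how many requests land on each server ID."""
    return Counter(pick_server(ip, num_servers) for ip in ip_addresses)


if __name__ == "__main__":
    # Made-up sample addresses, used only to eyeball the spread over 4 servers.
    sample = [f"10.0.{i}.{j}" for i in range(4) for j in range(256)]
    print(count_assignments(sample, 4))
```

The same IP address always maps to the same server ID as long as the number of servers stays fixed, and the counts printed for the sample addresses should come out roughly equal.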
Visualizing the mapping
Let’s take a step back to visualize how we could possibly use an array as a data structure to map each request to…