What is consistent hashing?

Problem

Consider a problem where a set of records needs to be assigned to n servers. A simple approach is to distribute the records evenly across the servers using the mod operation. For example, a record with key r is assigned to server r mod n.

The issue is that when a server is added or removed, n changes, so nearly all record keys have to be remapped. This incurs a great deal of overhead.
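To make the overhead concrete, here is a minimal sketch that counts how many keys change servers when the cluster grows from 4 to 5 servers under r mod n. The function name server_for and the choice of 10,000 sequential integer keys are assumptions made for illustration only.

```python
def server_for(key: int, n: int) -> int:
    """Assign an integer record key to one of n servers using key mod n."""
    return key % n


# Hypothetical workload: 10,000 sequential integer keys.
keys = range(10_000)

# Count the keys whose server changes when going from 4 to 5 servers.
moved = sum(1 for k in keys if server_for(k, 4) != server_for(k, 5))
fraction_moved = moved / len(keys)

print(f"{fraction_moved:.0%} of keys changed servers")  # prints "80% of keys changed servers"
```

A key keeps its server only when key mod 4 equals key mod 5, which happens for 4 out of every 20 consecutive keys, so 80% of the records move even though only one server was added.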

Consistent Hashing

Consistent hashing solves this problem: when a server is added or removed, only r/n records need to be remapped on average, where r is the number of records and n is the number of server slots. It is a good fit when:

  1. There are a number of servers that need to be scaled up or down depending upon the load.
  2. There are cache servers that need to be scaled.

There are many benefits to using consistent hashing:

  • Scalability
  • Load distribution
  • Quick replication and partitioning of data
  • Faster retrieval of keys as each server holds a limited number of keys

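The key idea behind consistent hashing is that every record and every server is hashed to a point on the unit circle. Each record is then assigned to the first server encountered when moving clockwise from the record's position. Because a good hash function spreads points roughly uniformly around the circle, records are distributed fairly evenly across the servers on average.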
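The clockwise lookup described above can be sketched as follows. This is an illustrative sketch, not a production implementation: the names ring_position and HashRing are hypothetical, MD5 is used only as a convenient, well-mixed hash, and a position is represented as a fraction of a full turn in [0, 1), with increasing values standing in for the clockwise direction.

```python
import bisect
import hashlib


def ring_position(name: str) -> float:
    """Hash a name to a point in [0, 1) on the unit circle (illustrative helper)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    # Keep the top 53 bits so the division below is exact and stays below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


class HashRing:
    """Minimal ring: each server occupies a single point on the circle."""

    def __init__(self, servers):
        # Sort servers by their position so lookups can use binary search.
        self._ring = sorted((ring_position(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the first server at or after the key's position, wrapping around."""
        i = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[i % len(self._ring)][1]


ring = HashRing(["server-a", "server-b", "server-c"])
print(ring.lookup("record-42"))
```

Keeping the server positions sorted lets each lookup run as a binary search, and the modulo on the index handles the wrap-around from the end of the circle back to its start.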

Here, if a new server is added to the unit circle, only the records that lie between the new server and the server preceding it on the circle need to be reassigned (to the new server); all other records keep their previous assignments. Similarly, when a server is removed from the unit circle, only the records that were assigned to it need to be reassigned.
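The following sketch checks this behavior empirically: it assigns a set of keys to four servers, adds a fifth, and confirms that every key that moved now belongs to the new server. The helper names (position, build_ring, owner), the server names, and the use of SHA-256 are assumptions made for the example. Integer positions are used in place of points on the unit circle, which does not change the ordering.

```python
import bisect
import hashlib


def position(name: str) -> int:
    """Hypothetical helper: hash a name to an integer point on the ring."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(servers):
    """Return the servers sorted by their position on the ring."""
    return sorted((position(s), s) for s in servers)


def owner(key: str, ring) -> str:
    """Return the first server at or after the key's position, wrapping around."""
    positions = [pos for pos, _ in ring]
    i = bisect.bisect_left(positions, position(key))
    return ring[i % len(ring)][1]


keys = [f"record-{i}" for i in range(10_000)]
before = build_ring(["server-a", "server-b", "server-c", "server-d"])
after = build_ring(["server-a", "server-b", "server-c", "server-d", "server-e"])

moved = [k for k in keys if owner(k, before) != owner(k, after)]

# Every key that changed owner must now belong to the newly added server.
assert all(owner(k, after) == "server-e" for k in moved)
print(f"{len(moved) / len(keys):.1%} of keys moved, all of them to server-e")
```

With a single point per server, the fraction that moves in any one run depends on where the new server happens to land, so it can differ noticeably from the r/n average.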

Copyright ©2024 Educative, Inc. All rights reserved