Ensure Scalability and Replication
Explore methods to ensure key-value store scalability and high availability. Implement consistent hashing to partition data efficiently, using virtual nodes to distribute load uniformly and prevent hotspots. Define peer-to-peer replication strategies to achieve durability across multiple storage nodes.
Add scalability
Scalability requires distributing data across multiple storage nodes. As demand changes, we must dynamically add or remove nodes. To achieve this, we partition data to balance the load across the system.
A traditional partitioning method uses the modulo operator. For a system with m = 4 nodes, we want 25% of requests to go to each node. When a request arrives, we hash its key and compute the remainder modulo m; the result x (calculated as hash % m) determines which node processes the request.
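A minimal sketch of this routing scheme in Python (the key names and the MD5-based hash are illustrative assumptions):

```python
import hashlib

def node_for_key(key: str, num_nodes: int) -> int:
    """Route a key to a node using modulo hashing."""
    # Use a stable hash; Python's built-in hash() is salted per process.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

# With m = 4 nodes, keys spread roughly evenly across nodes 0-3.
print(node_for_key("user:42", 4))
```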
We want to scale infrastructure with minimal disruption. However, modulo hashing is inefficient for dynamic scaling. Adding or removing a node changes the divisor m, which alters the mapping for nearly all keys.
For example, if node 2 is removed, a key previously mapped to it might shift to node 1 because the divisor drops from 4 to 3, and hash % 3 generally differs from hash % 4. Crucially, this remapping affects most keys in the system, not just the ones stored on the removed node.
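A quick experiment makes the disruption concrete; here we count how many keys change nodes when m drops from 4 to 3 (the key set is arbitrary, and node_for_key is the sketch from above):

```python
import hashlib

def node_for_key(key: str, num_nodes: int) -> int:
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

keys = [f"key-{i}" for i in range(10_000)]
moved = sum(1 for k in keys if node_for_key(k, 4) != node_for_key(k, 3))
# For uniform hashes, roughly 75% of keys land on a different node
# after removing one of four nodes.
print(f"{moved / len(keys):.0%} of keys moved")
```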
Next, we will examine how to distribute data efficiently.
Consistent hashing
Consistent hashing manages load effectively by minimizing data movement during scaling. We visualize the hash space as a ring of values from 0 to 2^n - 1 (for an n-bit hash function), where the largest value wraps around to 0.
Each node ID is hashed to assign it a position on the hash ring. Request keys are hashed using the same function to determine their position on the ring. The request is routed to the first node encountered when traversing clockwise from the key’s position on the ring.
When a new node joins the ring, it takes over a portion of the keys from its immediate successor. Other nodes remain unaffected. This ensures that only a small subset of keys moves, making scaling efficient. Since hashes are randomly distributed, the load is generally expected to be even on average.
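The following sketch shows both lookup and node addition and removal, using a sorted list of ring positions and binary search; the node names and MD5-based hash are illustrative assumptions:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self):
        self._ring = []  # sorted list of (hash, node) ring positions

    def _hash(self, value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        bisect.insort(self._ring, (self._hash(node), node))

    def remove_node(self, node: str) -> None:
        self._ring.remove((self._hash(node), node))

    def node_for_key(self, key: str) -> str:
        # Walk clockwise: first node position at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = ConsistentHashRing()
for n in ["N1", "N2", "N3", "N4"]:
    ring.add_node(n)
print(ring.node_for_key("user:42"))  # routed to the clockwise successor
```

Note that adding or removing a node only changes which server owns the arc adjacent to that node's position; all other lookups return the same answer as before.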
The primary benefit of consistent hashing is that adding or removing nodes requires moving only a minimal number of keys. However, in practice, random distribution does not guarantee equal load. A server handling a large segment of the ring may receive a disproportionate share of storage and retrieval requests. This creates a hotspot, which can bottleneck the entire system.
As shown in the figure below, if the segment between nodes N4 and N1 is large, N1 handles significantly more requests than other nodes. This non-uniform distribution degrades performance.
Note: It’s a good exercise to think of possible solutions to the non-uniform load distribution before reading on.
Use virtual nodes
To distribute load more evenly, we use virtual nodes. Instead of mapping a physical node to a single point on the ring, we map it to multiple points using different hash functions.
For example, if we use three hash functions, each physical server appears at three distinct positions on the ring. When a request lands on the ring, it is processed by the next virtual node found clockwise, which maps back to a physical server. This interleaving makes the load distribution more uniform. Additionally, if a node has higher hardware capacity, we can assign it more virtual nodes, allowing it to serve a larger portion of requests.
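Extending the earlier ring sketch, each physical node can be placed at several positions by hashing derived labels such as "N1#vn0", "N1#vn1", and so on; the per-node weight parameter here is an illustrative assumption:

```python
import bisect
import hashlib

class VirtualNodeRing:
    def __init__(self, vnodes_per_weight: int = 3):
        self._ring = []  # sorted (hash, physical_node) positions
        self._vnodes = vnodes_per_weight

    def _hash(self, value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node: str, weight: int = 1) -> None:
        # A node with twice the capacity gets twice as many ring positions.
        for i in range(self._vnodes * weight):
            bisect.insort(self._ring, (self._hash(f"{node}#vn{i}"), node))

    def node_for_key(self, key: str) -> str:
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = VirtualNodeRing()
ring.add_node("N1")
ring.add_node("N2", weight=2)  # higher-capacity server owns more of the ring
print(ring.node_for_key("user:42"))
```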
Advantages of virtual nodes
Using virtual nodes offers several benefits:
Fault tolerance: If a node fails or undergoes maintenance, its workload is spread uniformly across multiple other nodes rather than overwhelming a single neighbor.
Capacity management: We can adjust the number of virtual nodes based on the heterogeneity of the physical infrastructure. A server with double the computational capacity can handle more virtual nodes and take on more load. In other words, work distribution stays proportional to the capabilities of individual servers, and we can add new, higher-capacity nodes without upgrading all existing hosts.
We have now made the key-value storage scalable. The next step is ensuring high availability.
Data replication
To ensure durability and availability, we replicate data across multiple nodes. Common strategies include primary-secondary or peer-to-peer replication.
Primary-secondary approach
In a primary-secondary architecture, one node (the primary) handles write requests while others (secondaries) replicate its data and serve read requests. Because secondaries copy changes after the fact, reads can lag behind writes. Furthermore, if the primary fails, the system cannot accept writes until a new primary is elected, making the primary a single point of failure for write availability.
Peer-to-peer approach
In a peer-to-peer approach, all nodes act as primaries. Any node can handle both read and write requests, replicating data to peers to stay updated. Replicating to all nodes is inefficient and costly; typically, we choose a replication factor of three or five nodes.
We’ll use a peer-to-peer relationship for replication. We’ll replicate the data on multiple hosts to achieve durability and high availability. Each data item will be replicated at n hosts, where n is a configurable replication factor.
The node that handles a read or write operation for a key is called the coordinator; it’s directly responsible for that key. A coordinator node assigned the key “K” is also responsible for replicating it to its n − 1 clockwise successor nodes on the ring.
In the illustration below, the coordinator stores the key “K” and replicates it to the next n − 1 nodes encountered while moving clockwise around the ring.
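Reusing the VirtualNodeRing sketch from earlier, a key's preference list (the distinct physical nodes holding its replicas) might be computed as below; the replication factor of 3 and the helper name are illustrative assumptions:

```python
import bisect

def preference_list(ring, key: str, n: int = 3) -> list:
    """Return the coordinator plus up to n-1 distinct clockwise successors."""
    positions = ring._ring  # sorted (hash, node) pairs from the ring sketch
    start = bisect.bisect_right(positions, (ring._hash(key), ""))
    replicas = []
    for step in range(len(positions)):
        node = positions[(start + step) % len(positions)][1]
        # Skip virtual-node duplicates so each replica lands on a distinct server.
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == n:
            break
    return replicas

ring = VirtualNodeRing()
for node in ["N1", "N2", "N3", "N4"]:
    ring.add_node(node)
print(preference_list(ring, "user:42"))  # coordinator first, then two successors
```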
What is the impact of synchronous or asynchronous replication? With synchronous replication, the coordinator waits for replicas to acknowledge a write before responding, which keeps replicas consistent but increases write latency and ties availability to the slowest replica. With asynchronous replication, the coordinator acknowledges the write immediately and propagates it in the background, which lowers latency but allows replicas to serve stale data temporarily.
In the context of the CAP theorem, key-value stores often prioritize Availability over Consistency (AP). If a network partition occurs, nodes continue to accept requests even if they cannot communicate with replicas. This ensures the system remains operational but may lead to temporary data inconsistencies.
When the connection is restored, nodes must sync data to resolve these conflicts. In the next lesson, we will explore how to handle these inconsistencies using data versioning.