Achieving Configurablility [merged with versioning]

Learn how the key-value storage node is made into a configurable service.

We'll cover the following

One of our functional requirements is that the system should be configurable. We want to control the tradeoffs between availability, consistency, cost-effectiveness, and performance. So let’s achieve Configurablility by implementing the basic functions, get and put, of the key-value store.

get and put operations

Every node can handle a get(read) and put(write) operations in our system. A node handling a read or write operation is known as coordinator. The coordinator is the first among the top n nodes in the preference list.

We have two strategies for a client to select a node. The first is to route the request to a generic load balancer, and the second is to use a partition-aware client library that routes requests directly to the appropriate coordinator nodes. Both approaches have their benefits. The client is not linked to the code in the first approach, whereas lower latency is achievable in the second.

Let’s work on making our service configurable by having an ability where we can control the tradeoffs between availability, consistency, cost-effectiveness, and performance. We can use a consistency protocol similar to those used in quorum systemsA quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system. Quorum-based voting in commit protocols and Quorum-based voting for replica control are two techniques used in distributed systems Source: Wikipedia. Let’s take an example. Say n in the top n of the preference list is equal to 3. It means three copies of the data need to be maintained. One on a node where the write operation is performed and the next two nodes while moving in the clockwise direction.

Using r and w

Now consider two variables, r and w. The r means the minimum number of nodes that need to be part of a successful read operation while w is the minimum number of nodes involved in a successful write operation. So if r=2, it means our system will read from two nodes even if we have data stored in three nodes. We have a quorum-like system by setting r + w > n.

Let’s say n = 3 which means we have three nodes where the data is copied to. Now for w = 2, the operation will make sure to write in two nodes to make this request successful. For the third node, the data will be updated asynchronously.

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy