...

Quorums in Distributed Systems

Look at the concept of quorums and see how they solve low availability problems in synchronous replication.

We'll cover the following...

The problem in synchronous replication
Possible solution
- Quorums

The main pattern we’ve seen so far is this: writes are performed to all the replica nodes, while reads are performed to one of them. When we ensure that writes are performed to all of them synchronously before replying to the client, we guarantee that the subsequent reads see all the previous writes—regardless of the node that processes the read operation.

Note that, in the above paragraph, we have used the term “performed”. While one node receives a write request in either of the replication algorithms discussed earlier, the data is updated on all the nodes as a result of this request. Similarly, when a node receives a read request, it reads it from its local storage rather than performing a read on all the nodes. In the case of multi-primary replication, reads may be performed on all the nodes to handle write conflicts, but that is one possible solution and can’t be generalized as a pattern.

The problem in synchronous replication

Availability is quite low for write operations, because the failure of a single node makes the system unable to process writes until the node recovers.

Possible solution

To solve this problem, we can use the reverse strategy. That is, we write data only to the node that is responsible for processing a write operation, but process read operations by reading from all the nodes and returning the latest value.

This increases the availability of writes significantly but decreases the availability of reads at the same time. So, we have a trade-off that needs a mechanism to achieve a balance. Let’s see that mechanism.

Quorums

A useful mechanism to achieve a balance in this trade-off is to use quorums.

Let’s consider an example. In a system of three replicas, we can say that writes need to complete in two nodes (as a quorum of two), while reads need to retrieve data from two ...

Before Getting Started

Introduction to Distributed Systems

Basic Concepts and Theorems

Distributed Transactions

Achieving Isolation

Achieving Atomicity

Concluding Distributed Transactions

Consensus

Time

Order

Networking

Security

Security Protocols

From Theory to Practice

Case Study 1: Distributed File Systems

Case Study 2: Distributed Coordination Service

Case Study 3: Distributed Data Stores

Case Study 4: Distributed Messaging System

Case Study 5: Distributed Cluster Management

Case Study 6: Distributed Ledger

Case Study 7: Distributed Data Processing Systems

Practices & Patterns

Communication Patterns

Coordination Patterns

Data Synchronization

Shared-nothing Architectures

Distributed Locking

Compatibility Patterns

Dealing with Failure

Distributed Tracing

Concluding this Course

Quorums in Distributed Systems

The problem in synchronous replication

Possible solution

Quorums