Concurrent Writes

We'll cover the following

We have previously learnt how strict quorum reads and writes can be employed in leaderless replication and allow us to retrieve latest data values for the most part. However, using strict quorums doesn’t prevent concurrent writes for the same data value resulting in conflicts, which if not resolved can result in data loss.

Before we embark on learning how conflicts are resolved or avoided with quorum writes, we shall explain what is meant by a concurrent write. Generally, concurrent writes are understood to mean writes that happen at exactly the same time. In distributed systems, clocks can’t be perfectly synchronized on all the nodes and it becomes difficult to tell if two writes indeed occurred simultaneously.

Example

Consider a system that stores key/value pairs. Node A writes a pair (X, 7) to the system. Node B retrieves Node A’s write, increments the key’s value and writes back (X, 8). In this example there are two write events, one by Node A and one by Node B. We can say that Node A’s write happens before Node B’s write since Node B builds upon or depends upon Node A’s write. Since Node A’s write happens before Node B’s write the two events aren’t concurrent and Node B’s write is said to be causally dependent on Node A’s write.

In contrast, consider the situation where the two nodes attempt to update the same key after a few hours without knowing that the other node also intends to update the same key. In this case, the writes are concurrent because neither is aware of the other’s occurrence. Note, it isn’t necessary that the two writes overlap in time. Even if they do, it isn’t guaranteed that the writes reach the destined replicas at the same time. Write requests can get delayed because of transient network or node failures. If several clients update the same key, the system can end up with inconsistent data if sane conflict resolution isn’t applied. Consider the following sequence of events when Node A attempts to update key X with a value 17 and Node B attempts to update the key X with a value 39.

  1. Node A’s write reaches replica#1 and Node B’s write never makes it to replica#1 because of a network outage.

Get hands-on with 1200+ tech skills courses.