Distributed Transaction
Explore the concept of distributed transactions involving multiple database nodes and learn the two-phase commit protocol to ensure atomic commits or aborts. Understand the roles of coordinators and participants, and examine failure scenarios and limitations in distributed transaction management.
Introduction
The storage engine and WAL handle the atomicity of a database transaction in a single node, as described earlier.
A distributed transaction is a transaction that spawns over multiple database nodes. It is inherently complex because all nodes must commit together or abort. A subset of nodes committing and the rest aborting will lead to inconsistent results.
In this section, we will discuss the two-phase commit, which is a popular distributed transaction technique.
Two-phase commit
A two-phase commit (2PC) is a protocol for achieving an atomic distributed commit across multiple nodes, such that all nodes commit together or abort together. The protocol ensures that at no time does a subset of nodes commit and the rest abort.
These are the components involved in a 2PC:
Coordinator: 2PC coordinates transactions across multiple nodes through a coordinator (or transaction manager). The coordinator is either a library or a process running on the client to orchestrate the distributed transaction.
Participant: The database nodes involved in the transaction are called participants.
Workflow
A 2PC transaction begins with the coordinator generating a globally unique transaction identifier. Then, the coordinator ...