Working of the Two-Phase Commit Protocol

Learn how the two-phase commit protocol provides atomicity.

Two-phase commit (2PC) has two phases: prepare and commit. Intuitively, the prepare phase asks every participant: can you do X? If everyone says yes, the commit phase asks them to do X. If anyone says no, X cannot be done at that moment.

In many databases, the required data is shipped to specific participants before a 2PC transaction starts. This data is given a unique identifier, which the 2PC phases later use to refer to it. Some participants start acquiring the required locks before the 2PC begins, in preparation for it, while others acquire locks in the prepare phase. Both strategies have their pros and cons. Acquiring locks early can reduce latency, but if the coordinator fails after sending the data and before starting the 2PC transaction, data items remain locked unnecessarily, potentially reducing the availability of that data.
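The ask-then-act intuition can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real database implementation: the `Participant` class, its `can_do` flag, and the `two_phase_commit` function are hypothetical names chosen for this example, and failures, timeouts, and durable logging are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical in-memory participant; `can_do` stands in for its local checks."""
    name: str
    can_do: bool = True
    state: str = "idle"

    def prepare(self, txn_id: str) -> bool:
        # Vote yes only if this participant can perform its part of the transaction.
        self.state = "prepared" if self.can_do else "aborted"
        return self.can_do

    def commit(self, txn_id: str) -> None:
        self.state = "committed"

    def abort(self, txn_id: str) -> None:
        self.state = "aborted"


def two_phase_commit(txn_id: str, participants: list) -> str:
    # Phase 1 (prepare): ask everyone "can you do X?"
    votes = [p.prepare(txn_id) for p in participants]

    # Phase 2: commit only if every vote was yes; otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit(txn_id)
        return "committed"
    for p in participants:
        p.abort(txn_id)
    return "aborted"
```

Calling `two_phase_commit("t1", [Participant("a"), Participant("b")])` commits on both participants, while a single participant created with `can_do=False` is enough to abort the transaction for everyone.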

The prepare phase

The 2PC process begins with the coordinator informing the cohorts of the new transaction by sending a prepare message. Each cohort evaluates whether it can execute the portion of the transaction that pertains to it and votes to commit or abort. If a cohort determines that it can execute its portion, it reports an affirmative vote to the coordinator. If not, it informs the coordinator, which then aborts the transaction and notifies the rest of the cohorts. Cohorts typically lock the data they will modify during the prepare phase and hold those locks until the end of the commit phase. This ensures that conflicting transactions cannot modify the same data simultaneously. Once a cohort has promised to commit or abort, it cannot unilaterally back out of that decision. The coordinator records the votes of all cohorts in its log, keeping a local copy to guard against failures during the 2PC transaction.
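The prepare phase described above, with locking on the cohort side and vote logging on the coordinator side, might look roughly like the following sketch. Everything here is an assumption made for illustration: the `Cohort` and `Coordinator` classes, the `run_prepare` method, and the in-memory list that stands in for a durable log. A real system would persist the log and exchange messages over a network.

```python
class Cohort:
    """Hypothetical cohort that locks the keys it will modify when it votes yes."""

    def __init__(self, name):
        self.name = name
        self.locks = {}  # key -> id of the transaction holding the lock

    def prepare(self, txn_id, keys):
        # Vote no if any key is already locked by a different transaction.
        if any(self.locks.get(k, txn_id) != txn_id for k in keys):
            return False
        # Lock the keys; they stay locked until the commit phase finishes.
        for k in keys:
            self.locks[k] = txn_id
        return True

    def release(self, txn_id):
        # Drop every lock held on behalf of this transaction.
        self.locks = {k: t for k, t in self.locks.items() if t != txn_id}


class Coordinator:
    def __init__(self):
        self.log = []  # in-memory stand-in for a durable log

    def run_prepare(self, txn_id, plan):
        """`plan` is a list of (cohort, keys) pairs for this transaction."""
        votes = []
        for cohort, keys in plan:
            vote = cohort.prepare(txn_id, keys)
            votes.append(vote)
            # Record each vote so it survives for later recovery decisions.
            self.log.append((txn_id, cohort.name, "yes" if vote else "no"))

        if all(votes):
            return "commit"
        # A single no vote aborts the transaction; tell every cohort to let go.
        for cohort, _ in plan:
            cohort.release(txn_id)
        return "abort"
```

In this sketch, a second transaction that touches a key already locked by a prepared transaction receives a no vote, and the coordinator releases only the locks taken on behalf of the aborted transaction.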
