wiki · home

Table of Contents

Distributed Transactions

Two-phase commit protocol

In the two-phase commit protocol (2PC) there is a Coordinator which as the name suggests takes care of coordinating the protocol with the steps below:

  1. Contact every participant telling them to get ready for the transaction.
  2. Gather the response from all participants and based on the response decide whether or not to commit. If all participants voted YES then commit, if not, abort.
                               ┌────┐                                        ┌────┐
                     ┌────────▶│ A  │                              ┌─────────│ A  │
                     │         └────┘                              │         └────┘
                  propose                                       yes/no
                     │                                             │
┌───────────┐────────┘         ┌────┐         ┌───────────┐◀───────┘         ┌────┐
│Coordinator│─────propose─────▶│ B  │         │Coordinator│◀────yes/no───────│ B  │
└───────────┘────────┐         └────┘         └───────────┘◀───────┐         └────┘
                     │                                             │
                  propose                                       yes/no
                     │         ┌────┐                              │         ┌────┐
                     └────────▶│ C  │                              └─────────│ C  │
                               └────┘                                        └────┘




                               ┌────┐                                        ┌────┐
                     ┌────────▶│ A  │                              ┌─────────│ A  │
                     │         └────┘                              │         └────┘
               commit/abort                                       ack
                     │                                             │
┌───────────┐────────┘         ┌────┐         ┌───────────┐◀───────┘         ┌────┐
│Coordinator│──commit/abort───▶│ B  │         │Coordinator│◀──────ack────────│ B  │
└───────────┘────────┐         └────┘         └───────────┘◀───────┐         └────┘
                     │                                             │
               commit/abort                                       ack
                     │         ┌────┐                              │         ┌────┐
                     └────────▶│ C  │                              └─────────│ C  │
                               └────┘                                        └────┘

Problems with 2PC

2PC is a blocking protocol which means that if some entity crashes the others will wait for some time until the crashed entity comes back.

Consider the case when the propose messages have been sent but not all of them and the coordinator died during the process. Some participants received the proposal and are waiting for the next step, if the coordinator doesn’t come back or if it takes a long time to come up these participants will be blocked while other servers will be there thinking that everything is fine. If all participants received the proposal and are blocked after the coordinator crashed they can’t simply discard their locks on resources that were acquired after the propose message because the coordinator could have come back saw their yes and died again or is just taking too much time to reply with commit.

One way to mitigate this problem is by having a secondary process that can take over the job of coordinating the protocol if the coordinator stops working. For that the participants must save what was their response to some round of agreement to remember what was their reply to the original coordinator. The coordinator could save the results of completed rounds on durable storage so when it comes back up it can inform the participants the result and they can take action cleaning up their logs.

Even with a standby coordinator there might be problems if one of the participants stops working. The new coordinator doesn’t know what the failed participant voted and cannot decide if the round should be commit or abort based on the responses from the surviving participants.

Three-phase commit protocol

The three-phase commit protocol (3PC) adds a new step to 2PC in order to not block during the process and in case an entity does not receive some message for a certain period of time it can take action without waiting forever. The steps for the coordinator are the following:

  1. Contact every participant telling them to get ready for the transaction.
  2. Gather the response from all participants and based on the response decide whether or not to prepare for commit.
  3. Send commit/abort based on the response from their participants.
                               ┌────┐                                        ┌────┐
                     ┌────────▶│ A  │                              ┌─────────│ A  │
                     │         └────┘                              │         └────┘
                  propose                                       yes/no
                     │                                             │
┌───────────┐────────┘         ┌────┐         ┌───────────┐◀───────┘         ┌────┐
│Coordinator│─────propose─────▶│ B  │         │Coordinator│◀────yes/no───────│ B  │
└───────────┘────────┐         └────┘         └───────────┘◀───────┐         └────┘
                     │                                             │
                  propose                                       yes/no
                     │         ┌────┐                              │         ┌────┐
                     └────────▶│ C  │                              └─────────│ C  │
                               └────┘                                        └────┘




                               ┌────┐                                        ┌────┐
                     ┌────────▶│ A  │                              ┌─────────│ A  │
                     │         └────┘                              │         └────┘
                  prepare                                         ack
                     │                                             │
┌───────────┐────────┘         ┌────┐         ┌───────────┐◀───────┘         ┌────┐
│Coordinator│─────prepare─────▶│ B  │         │Coordinator│◀──────ack────────│ B  │
└───────────┘────────┐         └────┘         └───────────┘◀───────┐         └────┘
                     │                                             │
                  prepare                                         ack
                     │         ┌────┐                              │         ┌────┐
                     └────────▶│ C  │                              └─────────│ C  │
                               └────┘                                        └────┘




                               ┌────┐                                        ┌────┐
                     ┌────────▶│ A  │                              ┌─────────│ A  │
                     │         └────┘                              │         └────┘
               commit/abort                                       ack
                     │                                             │
┌───────────┐────────┘         ┌────┐         ┌───────────┐◀───────┘         ┌────┐
│Coordinator│──commit/abort───▶│ B  │         │Coordinator│◀──────ack────────│ B  │
└───────────┘────────┐         └────┘         └───────────┘◀───────┐         └────┘
                     │                                             │
               commit/abort                                       ack
                     │         ┌────┐                              │         ┌────┐
                     └────────▶│ C  │                              └─────────│ C  │
                               └────┘                                        └────┘

In 3PC if there is a timeout when the coordinator is waiting for the yes/no from the participants the coordinator decides to abort the transaction. If the timeout from a participant happens after the coordinator sent the prepare messages it should commit. In this case the participants know they received the prepare and voted yes so when a participant recovers it will know that the round should have committed.

If the coordinator dies after sending the prepare message the standby coordinator can contact the participants and ask if they received the prepare message and if it happened then it is safe to send commit because the participants can only receive prepare messages if all participants agreed to proceed. If some participant didn’t receive prepare but others did it is also safe to commit. Maybe the message got lost or didn’t arrive yet. In case all the participants didn’t receive the prepare messages the new coordinator can safely abort.

The new step helps the protocol by giving the participants the ability to decide to abort in case they don’t receive the prepare message after a certain time instead of blocking indefinitely waiting for the coordinator instruction to commit or abort. This step also helps in case the participants fail after the prepare messages were sent, assuming these nodes will recover, the protocol can commit because the participants said they were prepared.

Problems with 3PC

In face of a network partition where all nodes at one side (side A) received the prepare message and the nodes on the other side (side B) didn’t receive it the protocol will break since each side will take different actions: A will commit and B will abort.

References

Carlos Galdino
@carlosgaldino
github.com/carlosgaldino
blog.carlosgaldino.com