Detailed Design of Chubby: Part IV

Learn about fault-tolerance, scalability, and availability of Chubby's design.


Nodes can fail, and we need to have an excellent strategy to minimize downtime. One way to reduce downtime is a fast failover. Let’s discuss the failover scenario and how our system handles such cases.

A primary replica discards its state about sessions, handles, and locks when it fails or loses leadership. This must be followed by the election of a new primary replica with the following two possible options:

  1. If a primary replica gets elected rapidly, clients can connect with the new primary replica before their own approximation of the primary replica lease’s (local lease) duration runs out.
  2. If the election extends for a longer duration, clients discover the new primary replica by emptying their caches and waiting for the grace period. As a result, our system can keep sessions running during failovers that go past the typical lease time-out, thanks to the grace period.

Once a client has made contact with the new primary replica, the client library and the primary replica give this impression to the application that nothing went wrong by working together. The new primary replica must recreate the in-memory state of the primary replica that it replaced. It accomplishes this in part by:

  • Reading data that has been duplicated via the standard database replication technique and kept safely on disk
  • Acquiring state from clients
  • Making cautious assumptions

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.