Spanner using TrueTime

Let's examine how Spanner provides its consistency guarantees.

Spanner makes use of a novel API to record time, called TrueTime (E. Brewer, “Spanner, TrueTime and the CAP Theorem,” 2017), which is the key enabler for most of the consistency guarantees provided by Spanner.

TrueTime API

The TrueTime API directly exposes clock uncertainty, so nodes can wait out that uncertainty when comparing timestamps retrieved from different clocks. If the uncertainty grows because of some failure, this manifests as increased latency, since nodes have to wait for longer periods.

TrueTime represents time as a TTInterval, which is an interval [earliest, latest] with bounded time uncertainty. The TrueTime API provides a method TT.now() that returns a TTInterval that is guaranteed to contain the absolute time during which the method was invoked.

Note: As previously explained in the chapter about time, this assumes an idealized absolute time that uses the Earth as a single frame of reference and is generated using multiple atomic clocks.

TrueTime also provides two convenience methods, TT.after(t) and TT.before(t), that specify whether t is definitely in the past or definitely in the future, respectively. These are essentially just wrappers around TT.now(), since TT.after(t) == t < TT.now().earliest and TT.before(t) == t > TT.now().latest. As a result, Spanner can assign timestamps to transactions that have global meaning and can be compared by nodes having different clocks.
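The interface described above can be sketched in a few lines of Python. This is a toy stand-in rather than Google's implementation: the class name TrueTimeSketch, the fixed epsilon, the injectable clock parameter, and the wait_until_after helper are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on absolute time, in seconds
    latest: float    # upper bound on absolute time, in seconds


class TrueTimeSketch:
    """Toy model of the TrueTime API (hypothetical, for illustration only)."""

    def __init__(self, epsilon=0.004, clock=time.time):
        self.epsilon = epsilon  # assumed fixed uncertainty: half the interval width
        self.clock = clock      # hypothetical local clock source

    def now(self):
        # Return an interval assumed to contain the absolute time.
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        # True only if t is definitely in the past.
        return t < self.now().earliest

    def before(self, t):
        # True only if t is definitely in the future.
        return t > self.now().latest

    def wait_until_after(self, t, poll=0.001):
        # Wait out the uncertainty: block until t is definitely in the past.
        while not self.after(t):
            time.sleep(poll)
```

Note that for a timestamp t that falls inside the current interval, both after(t) and before(t) return False. This is the window of uncertainty that a node has to wait out before it can safely order t against its own clock.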

Implementation of TrueTime

TrueTime is implemented by a set of time manager machines per datacenter and a time follower daemon per machine.

Note: In the original research paper describing Spanner, the authors use “time master machine” and “timeslave machine.” We will use the terms “time manager machine” and “time follower daemon” to refer to the same things.

Time manager machines

Each manager uses one of two different forms of time reference, either GPS or atomic clocks. Both forms are used because they have different failure modes. The manager servers compare their time references periodically. They also cross-check the rate at which their reference time advances against their local clock, evicting themselves from the cluster if there is a significant divergence.
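The rate cross-check a manager performs on itself might look roughly like the sketch below. The function name should_evict, the two-sample approach, and the threshold of 200 microseconds per second are assumptions for illustration; the text does not specify how the divergence is measured or what counts as significant.

```python
def should_evict(ref_start, ref_end, local_start, local_end,
                 max_drift_rate=200e-6):
    """Return True if the local clock rate diverges too far from the reference.

    All timestamps are in seconds. max_drift_rate is an assumed threshold,
    expressed as seconds of divergence per second of reference time.
    """
    ref_elapsed = ref_end - ref_start
    local_elapsed = local_end - local_start
    if ref_elapsed <= 0:
        raise ValueError("reference time must advance between samples")
    # Relative difference between how fast the two clocks advanced.
    drift_rate = abs(local_elapsed - ref_elapsed) / ref_elapsed
    return drift_rate > max_drift_rate
```

For example, a local clock that advances 30.001 seconds while the reference advances 30 seconds stays within the assumed threshold, whereas one that advances 30.03 seconds would cause the manager to evict itself.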

Time follower daemon

Daemons poll a variety of managers to synchronize their local clocks and advertise an uncertainty e, which corresponds to half of the interval’s width, (latest - earliest) / 2.

This uncertainty depends on the manager-daemon communication latency and the uncertainty of the managers’ time. It is a sawtooth function of time: it increases slowly between synchronizations and drops back each time the daemon synchronizes again.

In Google’s production environment, the average value of this uncertainty was reported to be just four milliseconds.
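The sawtooth shape of the uncertainty can be modeled with a simple sketch. The parameter values here (a 1 ms base uncertainty right after synchronization, an assumed drift of 200 microseconds per second, and a 30-second polling interval) are illustrative assumptions, chosen so that the average comes out close to the four milliseconds mentioned above.

```python
def uncertainty(t_since_sync, base=0.001, drift_rate=200e-6):
    """Uncertainty (seconds) a given time after the last synchronization.

    base models manager uncertainty plus communication latency (assumed).
    drift_rate models the assumed worst-case local clock drift.
    """
    return base + drift_rate * t_since_sync


def sawtooth(t, poll_interval=30.0, base=0.001, drift_rate=200e-6):
    """Uncertainty at time t, assuming a sync at every multiple of poll_interval."""
    return uncertainty(t % poll_interval, base, drift_rate)


# Average the uncertainty over one polling interval, sampled every 10 ms.
samples = [sawtooth(i * 0.01) for i in range(3000)]
average = sum(samples) / len(samples)
```

Under these assumptions, the uncertainty climbs from about 1 ms just after a synchronization to about 7 ms just before the next one, so its average sits near 4 ms.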
