wiki · home


Existential Consistency: Measuring and Understanding Consistency at Facebook

The paper analyzes requests made to Facebook’s TAO system to quantify how often anomalies regarding data consistency happen in practice. The system consists of a two-level cache and a sharded database with a single-master-per-shard. The following figure illustrate a typical request:

Within each cluster the system guarantees per-object sequential and read-after-write consistency. Across the entire system the system provides eventual consistency.

The results show that only 0.0004% of the requests returned a different value than what a linearizable system would return1.

Two forms of analysis are performed: a principled version and a practical version. The principled analysis identifies when the results would differ from a stronger consistency model and is not performed in real-time. The practical analysis is performed at real-time but not for all requests and is used as a monitored metric to help detecting bugs.

Consistency checker

Principled analysis

It’s interesting how they constructed their consistency checker for the principled analysis. They collect a bunch of information about the requests and then create a graph with the requests. Each write and read operations are vertices in the graph and based on the time they completed the edges are created connecting them. Edges are also created to connect subsequent read operations with the write operation they read, and the edges are created in both directions.

The trick is that the read operations are merged with the write operation they read and the edges are inherited by the write operation. So if the final graph has a cycle it is a sign that there is an anomaly. An acyclic graph means there is a total order over the operations.

The following figure illustrates a trace that shows an anomaly:

               ┌────┐          ┌────┐           ┌────┐
               │    │          │    │           │    │
before merge   │ w1 │─────────▶│ w2 │──────────▶│ r1 │
               │    │          │    │           │    │
               └────┘          └────┘           └────┘
                  │                                ▲
                  │                                │
                  └────────────────────────────────┘


               ┌────┐          ┌────┐
               │    │─────────▶│    │
after merge    │ w1 │          │ w2 │
               │    │◀─────────│    │
               └────┘          └────┘

The other weaker consistency models provided by TAO like per-object sequential consistency and read-after-write are detected using the same graph.

Causal consistency can’t be checked with this model. The following example, where C0, C1, C2 represent different clusters, is valid under causal consistency:

C0: Wx1
C1:    Rx1
C2:       Rx0

The following example is not:

C0: Wx1
C1:    Rx1 Wy2
C2:           Ry2 Rx0

In the example above if C2 saw the operation that caused y to change it should also see all operations that might have influenced the write to y. The trace for the y object is Wy2 Ry2, which is valid and alone does not give evidence about an illegal operation. The trace for the x object is the same in both examples: Wx1 Rx1 Rx0. Since Facebook only traces a tiny random sample of keys they might not see both y and x traces and won’t be able to perform a causal check since each trace alone is not enough for detecting such violations.

Practical analysis

Since the principled analysis is not a good fit for real-time monitoring they came up with a way for monitoring in real-time how often anomalies happened, called Φ(P)-consistency. The definition is as follows:

The Φ(P)-consistency of a set of replicas P is the frequency that injected reads for the same data to all pP receive the same response from each p.

The injected reads are issued from the same server and the responses are compared when all have returned. Because there is no clock synchronization, versioning, or logging involved, the responses might be affected by the network and give false negatives (the read is marked inconsistent when it should be consistent). For example, a server sends injected read operations to other replicas but if in a given replica a write operation arrives and is executed just before the injected read is received, when the read returns it will have a different value than the others that received the injected read before the write. This injected read will be marked as inconsistent even though that at the time the injected read was dispatched the replicas had a consistent view for the object.

Conclusion

At first glance you might expect that the TAO system would have higher rates of inconsistency but the results show the opposite2. Another interesting aspect is the idea of a metric to monitor these anomalies in real-time and give a hint about a possible bug that was put in production recently.

References

Notes


  1. The authors don’t mention multiple objects requests.↩︎

  2. Bear in mind that these results are Facebook-specific and might not be the same for different applications.↩︎