CAP theorem

Consistency, Availability, Partition tolerance - Choose two

Published on Wednesday, November 18, 2020

CAP theorem (Brewer's theorem) states that it is impossible in the distributed data store to simultaneously provide more than two out of the three following:

  • Consistency - Every read request receives the most recent write or an error
  • Availability - Every request receives a non-error response, but the data does not have to be the most recent
  • Partition Tolerance - The system continues to operate despite an arbitrary number of messages being dropped by the network nodes

By the nature of the networks, no distributed system can be 100% reliable - thus network partitioning has to be tolerated. With that in mind, two other options are consistency or availability. If we choose availability over consistency, the system will return data, but there will be no guarantee that it is the latest data, because of the network partitioning. If we choose consistency over availability, the system will return a time out (or error) if data has no guarantee, that is the latest information because of the network partitioning.

When systems run normally (without network errors), availability and consistency can be satisfied, and no trade-off has to be made.

  • Database systems with ACID guarantees, like traditional relation databases, choose consistency over availability.
  • System designed with BASE principle (eventual consistency), choose availability over consistency.

History

In the late 90s, Eric Brewer (Berkley) published the theorem as CAP principle. In 2002 Seth Gilbert(MIT) and Nancy Lynch(MIT) published formal proof and making the principle a theorem. With the cloud, microservices, and ever-increasing number of distributed systems, CAP became an important point of discussion in designing any large system.

PACELC theorem extends CAP theorem by adding latency.