🤖 AI Summary
Under asynchronous network partitions, the CAP theorem imposes a fundamental trade-off between consistency and availability. This paper challenges the conventional binary assumption that a system must choose either strong consistency or global availability. It proposes the “partial progress” conjecture: during a partition, the system can remain responsive and sustain non-zero throughput for a subset of clients.
Method: We formally define “partial progress” and prove its theoretical feasibility in the asynchronous model. We design CASSANDRA, a novel consensus protocol that leverages causal ordering and vector clocks to enable locally ordered replica processing without global synchronization, augmented by a lightweight coordination mechanism to ensure local consistency.
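As a generic illustration of the vector-clock idea mentioned in the method summary, the following minimal Python sketch shows how replicas can detect causal order and concurrency without global synchronization. It is not the CASSANDRA protocol itself; the function names (`tick`, `merge`, `happened_before`, `concurrent`) and the replica identifiers are assumptions introduced only for this example.

```python
from typing import Dict

# A vector clock maps a replica id to the number of events seen from it.
# Missing entries are treated as zero.
VectorClock = Dict[str, int]


def tick(clock: VectorClock, replica: str) -> VectorClock:
    """Return a copy of `clock` with `replica`'s counter advanced by one."""
    updated = dict(clock)
    updated[replica] = updated.get(replica, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, applied when a replica receives a message."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` causally precedes `b`: a <= b everywhere and a < b somewhere."""
    replicas = a.keys() | b.keys()
    no_greater = all(a.get(r, 0) <= b.get(r, 0) for r in replicas)
    strictly_less = any(a.get(r, 0) < b.get(r, 0) for r in replicas)
    return no_greater and strictly_less


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if the clocks differ and neither causally precedes the other."""
    differs = any(a.get(r, 0) != b.get(r, 0) for r in a.keys() | b.keys())
    return differs and not happened_before(a, b) and not happened_before(b, a)


# Two replicas (hypothetical ids) accept requests independently.
e1 = tick({}, "r1")
e2 = tick({}, "r2")
# r1 later learns about r2's event and records a new local event.
e3 = tick(merge(e1, e2), "r1")
```

Here `e1` and `e2` are concurrent, since neither replica has seen the other's event, while both causally precede `e3`. A protocol built on this idea must still decide how to order concurrent events, which is where a coordination mechanism comes in.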
Results: Experimental evaluation under simulated partition scenarios demonstrates that CASSANDRA sustains service availability for over 85% of clients, achieves linear throughput scaling, and significantly outperforms mainstream protocols such as Paxos and Raft.
📝 Abstract
Every application developer wants to give users consistent results and an always-available system despite failures. Boldly, the CAP theorem disagrees. It states that a system cannot be both consistent and available under network partitions; at most two of these three properties can be selected. One possible solution is to design coordination-free monotonic applications. However, a majority of real-world applications require coordination. We resolve this dilemma by conjecturing that partial progress is possible under network partitions. Partial progress ensures that the system appears responsive to a subset of clients and achieves non-zero throughput during failures. To this end, we present the design of our CASSANDRA consensus protocol, which allows partitioned replicas to order client requests.