Real Life Is Uncertain. Consensus Should Be Too!

📅 2025-05-14

🏛️ USENIX Workshop on Hot Topics in Operating Systems

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Traditional consensus protocols rely on deterministic f-threshold fault models that struggle to capture the complex failure behaviors observed in real-world systems, thereby limiting optimization of performance and cost. This work argues for consensus protocols grounded in a probabilistic fault model that incorporates machine-level failure curves and explores relaxing the rigid majority-quorum intersection constraint. By reflecting actual operating conditions more accurately, such protocols could make systems more reliable, efficient, cost-effective, and sustainable.

📝 Abstract

Modern distributed systems rely on consensus protocols to build a fault-tolerant core upon which they can build applications. Consensus protocols are correct under a specific failure model, where up to f machines can fail. We argue that this f-threshold failure model oversimplifies the real world and limits potential opportunities to optimize for cost or performance. We argue instead for a probabilistic failure model that captures the complex and nuanced nature of faults observed in practice. Probabilistic consensus protocols can explicitly leverage individual machine failure curves and explore side-stepping traditional bottlenecks such as majority quorum intersection, enabling systems that are more reliable, efficient, cost-effective, and sustainable.
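
To make the contrast between the f-threshold model and a probabilistic one concrete, the following minimal sketch computes the probability that at most f machines fail when each machine has its own failure probability. It is an illustration only, not the paper's method: it assumes machine failures are independent (real faults are often correlated), and the function names and the example probabilities are hypothetical.

```python
from typing import Sequence


def failure_count_distribution(p_fail: Sequence[float]) -> list[float]:
    """Return dist where dist[k] is the probability that exactly k machines fail.

    Assumption (not from the paper): failures are independent, so the count
    follows a Poisson binomial distribution, built here by dynamic programming.
    """
    dist = [1.0]  # with zero machines, zero failures is certain
    for p in p_fail:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # this machine stays up
            nxt[k + 1] += mass * p       # this machine fails
        dist = nxt
    return dist


def prob_at_most_f_failures(p_fail: Sequence[float], f: int) -> float:
    """Probability that the f-threshold assumption holds, i.e. at most f failures."""
    dist = failure_count_distribution(p_fail)
    return sum(dist[: f + 1])


if __name__ == "__main__":
    # Hypothetical per-machine failure probabilities over some time window.
    uniform = [0.01, 0.01, 0.01]
    mixed = [0.001, 0.001, 0.1]   # two reliable machines, one unreliable
    for name, probs in (("uniform", uniform), ("mixed", mixed)):
        print(name, prob_at_most_f_failures(probs, f=1))
```

Both clusters tolerate f = 1 under the threshold model, yet the probability that the assumption actually holds depends on the individual failure probabilities, which is the kind of information a probabilistic failure model can expose.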

Problem

Research questions and friction points this paper is trying to address.

consensus protocols

failure model

distributed systems

probabilistic failures

fault tolerance

Innovation

Methods, ideas, or system contributions that make the work stand out.

probabilistic consensus

failure curves

quorum intersection