🤖 AI Summary
Traditional consensus protocols rely on a deterministic f-threshold failure model that struggles to capture the complex failure behaviors observed in real-world systems, which limits opportunities to optimize for performance and cost. This work argues for consensus protocols grounded in a probabilistic failure model, which can use individual machine failure curves and explore side-stepping traditional bottlenecks such as majority quorum intersection. By more accurately reflecting actual operating conditions, such protocols could enable systems that are more reliable, efficient, cost-effective, and sustainable.
📝 Abstract
Modern distributed systems rely on consensus protocols to provide a fault-tolerant core on which applications are built. Consensus protocols are correct under a specific failure model in which up to f machines can fail. We argue that this f-threshold failure model oversimplifies the real world and limits opportunities to optimize for cost or performance. We argue instead for a probabilistic failure model that captures the complex and nuanced nature of faults observed in practice. Probabilistic consensus protocols can explicitly leverage individual machine failure curves and explore side-stepping traditional bottlenecks such as majority quorum intersection, enabling systems that are more reliable, efficient, cost-effective, and sustainable.
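To make the contrast with the f-threshold model concrete, the following is a minimal sketch, not the authors' method. It computes the probability that more than f machines fail when each machine has its own failure probability. It assumes failures are independent and that each machine's failure curve has already been reduced to a single probability for the time window of interest; both are simplifications introduced here for illustration. The function names and the example probabilities are hypothetical.

```python
from typing import List, Sequence


def failure_count_distribution(p_fail: Sequence[float]) -> List[float]:
    """Return dist where dist[k] = P(exactly k machines fail).

    Assumes independent failures (an illustrative simplification).
    Builds the distribution one machine at a time by dynamic programming.
    """
    dist = [1.0]  # with zero machines, zero failures happen with probability 1
    for p in p_fail:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this machine survives
            nxt[k + 1] += mass * p      # this machine fails
        dist = nxt
    return dist


def prob_more_than_f_failures(p_fail: Sequence[float], f: int) -> float:
    """Probability that the f-threshold assumption is violated."""
    dist = failure_count_distribution(p_fail)
    return sum(dist[f + 1:])


if __name__ == "__main__":
    # Hypothetical per-machine failure probabilities for one time window.
    three_reliable = [0.01, 0.01, 0.01]             # n = 3, tolerates f = 1
    five_cheaper = [0.05, 0.05, 0.05, 0.05, 0.05]   # n = 5, tolerates f = 2
    print(prob_more_than_f_failures(three_reliable, f=1))
    print(prob_more_than_f_failures(five_cheaper, f=2))
```

Under these assumptions, two clusters with different sizes and thresholds can be compared by a single number, the chance that the threshold is exceeded, rather than by f alone. This is the kind of trade-off between machine quality, machine count, and cost that a fixed f-threshold model leaves out.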