🤖 AI Summary
This work addresses the challenges of efficiently disseminating and storing erasure-coded data in distributed systems, where unpredictable network latency, complex recovery procedures, and reliance on centralized coordination hinder performance and scalability. To overcome these limitations, the paper proposes Rafture, a novel distribution algorithm that integrates the Raft consensus protocol with erasure coding. Rafture introduces a post-dissemination pruning mechanism to dynamically adjust storage overhead, employs a fixed-threshold erasure code combined with a two-dimensional chunk allocation strategy to simplify recovery without requiring additional metadata, and enables nodes to make autonomous decisions based solely on local information, thereby eliminating dependence on a centralized leader. Experimental results demonstrate that Rafture significantly reduces long-term storage costs while improving recovery efficiency in highly dynamic network environments.
📝 Abstract
Spreading and storing erasure-coded data in distributed systems effectively is challenging in real settings. Practical deployments must contend with unpredictable network latencies, particularly when information dispersal is integrated into consensus protocols, a prominent and latency-sensitive use case. Existing approaches address this challenge through timeout-based dissemination and adaptive communication or storage decisions driven by acknowledgments during dissemination. However, these designs focus almost exclusively on dissemination-time efficiency, complicate recovery with reconstruction procedures that require metadata that can differ per consensus value, and rely on a centralized leader to make storage decisions for all nodes.
This paper introduces \textbf{Rafture}, a novel information dispersal algorithm, and its integration in a consensus protocol, that overcomes these limitations. Rafture is the first information dispersal solution to incorporate post-dissemination pruning, allowing systems to adapt storage costs after dissemination completes. It employs a simple, fixed-threshold erasure code while varying distinct fragment assignment along a second dimension. This ensures that reconstruction is always possible from $F+1$ fragments using the same interpolation method and no additional metadata. Rafture further enables nodes to adapt autonomously based on locally observed information, eliminating the need for global coordination.
We evaluate Rafture in highly dynamic network settings and show that it simplifies recovery while significantly improving long-term storage consumption under variable network conditions.