AI Summary
To address high update latency and low throughput in erasure-coded storage systems, this paper proposes a two-phase logged-update mechanism: lightweight data logging during the synchronous phase and real-time garbage collection leveraging spatiotemporal locality during the asynchronous phase. We innovatively design a three-tier logging structure and an SSD-lifetime-aware sequential I/O scheduler, converting random writes into sequential writes to significantly reduce update overhead. This is the first work to decouple and optimize the erasure-code update path. Evaluated on real-world traces from Alibaba Cloud and Tencent Cloud, our approach improves update throughput by 7.6× and 5.0×, respectively, while reducing SSD read, write, and erase counts, extending SSD lifetime by up to 13×.
Abstract
Compared to replication-based storage systems, erasure-coded storage incurs significantly higher overhead during data updates. To address this issue, various parity logging methods have been proposed. Nevertheless, because erasure-code updates involve a long update path and a substantial amount of random I/O, the resulting high latency and low throughput often fail to meet the requirements of high-performance applications. To this end, we propose a two-stage data update method called TSUE. TSUE divides the update process into a synchronous stage that records updates in a data log and an asynchronous stage that recycles the log in real time. TSUE reduces update latency by transforming random I/O into sequential I/O, and it significantly reduces recycling overhead by exploiting a three-layer log and the spatio-temporal locality of access patterns. In an SSD cluster, TSUE improves update performance by 7.6× under the Ali-Cloud trace and 5× under the Ten-Cloud trace, and it extends SSD lifespan by up to 13× by reducing the frequency of reads, writes, and erase operations.
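The two-stage idea can be illustrated with a toy sketch. It is not the TSUE implementation: the single XOR parity block, the in-memory list standing in for the data log, and every name in it (`ToyLoggedStripe`, `update`, `recycle`) are assumptions made for illustration, and the three-layer log is not modeled. The sketch only shows why an append-only synchronous stage is cheap and how locality lets the asynchronous stage merge repeated updates to the same block into one in-place write.

```python
class ToyLoggedStripe:
    """Toy single-parity (XOR) stripe with a two-stage update path.

    Illustration only: the XOR parity, the in-memory list used as a log,
    and all names here are assumptions, not the TSUE design.
    """

    def __init__(self, data_blocks):
        self.data = list(data_blocks)   # one int per data block
        self.parity = 0
        for value in self.data:
            self.parity ^= value
        self.log = []                   # append-only data log

    def update(self, index, new_value):
        """Synchronous stage: append the update to the log and return."""
        self.log.append((index, new_value))

    def read(self, index):
        """Return the newest value, checking the log before the data block."""
        for logged_index, value in reversed(self.log):
            if logged_index == index:
                return value
        return self.data[index]

    def recycle(self):
        """Asynchronous stage: merge logged updates and apply them in place.

        Returns the number of in-place data block writes performed.
        """
        latest = {}
        for index, value in self.log:   # later entries overwrite earlier ones
            latest[index] = value
        for index, value in latest.items():
            self.parity ^= self.data[index] ^ value   # apply the parity delta
            self.data[index] = value
        self.log.clear()
        return len(latest)
```

In this sketch, four logged updates that touch only two distinct blocks are recycled with two data writes and two parity deltas, rather than four of each. A real system would persist the log and handle general erasure codes instead of a single XOR parity.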