TSUE: A Two-Stage Data Update Method for an Erasure Coded Cluster File System

📅 2025-04-24

📈 Citations: 0

✨ Influential: 0

career value

197K/year

🤖 AI Summary

To address high update latency and low throughput in erasure-coded storage systems, this paper proposes a two-phase logged-update mechanism: lightweight data logging during the synchronous phase and real-time garbage collection leveraging spatiotemporal locality during the asynchronous phase. We innovatively design a three-tier logging structure and an SSD-lifetime-aware sequential I/O scheduler, converting random writes into sequential writes to significantly reduce update overhead. This is the first work to decouple and optimize the erasure-code update path. Evaluated on real-world traces from Alibaba Cloud and Tencent Cloud, our approach improves update throughput by 7.6× and 5.0×, respectively, while reducing SSD read, write, and erase counts—extending SSD lifetime by up to 13×.

Technology Category

Application Category

📝 Abstract

Compared to replication-based storage systems, erasure-coded storage incurs significantly higher overhead during data updates. To address this issue, various parity logging methods have been pro- posed. Nevertheless, due to the long update path and substantial amount of random I/O involved in erasure code update processes, the resulting long latency and low throughput often fail to meet the requirements of high performance applications. To this end, we propose a two-stage data update method called TSUE. TSUE divides the update process into a synchronous stage that records updates in a data log, and an asynchronous stage that recycles the log in real-time. TSUE effectively reduces update latency by transforming random I/O into sequential I/O, and it significantly reduces recycle overhead by utilizing a three-layer log and the spatio-temporal locality of access patterns. In SSDs cluster, TSUE significantly im- proves update performance, achieving improvements of 7.6X under Ali-Cloud trace, 5X under Ten-Cloud trace, while it also extends the SSD's lifespan by up to 13X through reducing the frequencies of reads/writes and of erase operations.

Problem

Research questions and friction points this paper is trying to address.

Reduces high overhead in erasure-coded storage updates

Transforms random I/O into sequential I/O for lower latency

Improves update performance and extends SSD lifespan significantly

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage update method reduces latency

Transforms random I/O into sequential I/O

Three-layer log leverages access locality

🔎 Similar Papers

No similar papers found.