Making Wide Stripes Practical: Cascaded Parity LRCs for Efficient Repair and High Reliability

📅 2025-12-11
🤖 AI Summary
Problem: In wide-stripe erasure coding, Locally Repairable Codes (LRCs) suffer from structural limitations: enlarged local groups increase single-node repair cost, multi-node failures frequently trigger costly global repairs, and reliability degrades sharply. Method: This paper proposes Cascaded Parity LRCs (CP-LRCs), a family of LRCs that establishes structured dependencies between local and global parity blocks. A global parity block is decomposed across all local parity blocks, forming a cascaded parity group that enables coordinated local–global repair while preserving MDS-level fault tolerance. CP-LRCs are built on a finite-field coefficient-generation framework and cascade-aware repair algorithms, and are instantiated as two variants: CP-Azure and CP-Uniform. Contribution/Results: Evaluations on Alibaba Cloud show repair-time reductions of up to 41% for single-node failures and 26% for two-node failures.

📝 Abstract
Erasure coding with wide stripes is increasingly adopted to reduce storage overhead in large-scale storage systems. However, existing Locally Repairable Codes (LRCs) exhibit structural limitations in this setting: inflated local groups increase single-node repair cost, multi-node failures frequently trigger expensive global repair, and reliability degrades sharply. We identify a key root cause: local and global parity blocks are designed independently, preventing them from cooperating during repair. We present Cascaded Parity LRCs (CP-LRCs), a new family of wide-stripe LRCs that embed structured dependency between parity blocks by decomposing a global parity block across all local parity blocks. This creates a cascaded parity group that preserves MDS-level fault tolerance while enabling low-bandwidth single-node and multi-node repairs. We provide a general coefficient-generation framework, develop repair algorithms exploiting cascading, and instantiate the design with CP-Azure and CP-Uniform. Evaluations on Alibaba Cloud show reductions in repair time of up to 41% for single-node failures and 26% for two-node failures.

Problem

Research questions and friction points this paper is trying to address.

Reduces single-node repair cost in wide-stripe LRCs
Minimizes expensive global repair for multi-node failures
Enhances reliability while preserving MDS fault tolerance
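
To make the first two friction points concrete, the following sketch computes repair reads for a conventional Azure-style LRC, in which data blocks are split evenly into local groups with one local parity each and global parities belong to no local group. The function name `azure_style_lrc_costs` and both parameter sets are illustrative assumptions, not configurations taken from the paper.

```python
from fractions import Fraction


def azure_style_lrc_costs(k, l, g):
    """Repair reads for a conventional Azure-style LRC (illustrative model).

    k: data blocks, l: local parities (one per local group), g: global parities.
    Assumes data blocks are divided evenly into l local groups and that
    global parities are not covered by any local group.
    """
    if k % l != 0:
        raise ValueError("k must be divisible by l in this simplified model")
    group_size = k // l
    return {
        # Total blocks stored per data block.
        "storage_overhead": Fraction(k + l + g, k),
        # A lost data block is rebuilt from the other (group_size - 1)
        # data blocks in its group plus the group's local parity.
        "data_block_repair_reads": group_size,
        # A lost global parity is recomputed from all k data blocks.
        "global_parity_repair_reads": k,
    }


# Hypothetical parameter sets: a narrow stripe and a wide stripe.
narrow = azure_style_lrc_costs(k=12, l=2, g=2)
wide = azure_style_lrc_costs(k=96, l=4, g=4)

for name, costs in (("narrow", narrow), ("wide", wide)):
    print(name, {key: str(value) for key, value in costs.items()})
```

In this model the wide stripe lowers storage overhead from 4/3 to 13/12, but a single data-block repair grows from 6 to 24 reads, and rebuilding a global parity grows from 12 to 96 reads.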

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cascaded parity LRCs embed structured dependency between parity blocks
Decompose global parity across local parity blocks for cooperation
Enable low-bandwidth single-node and multi-node repairs with MDS-level fault tolerance
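
The idea of decomposing a global parity across local parities can be illustrated with a toy example. The sketch below is a simplification under stated assumptions, not the paper's construction: the coefficients `1..k`, the even grouping, and the helper names are all hypothetical, and the source does not say which coefficients CP-LRCs use or how they guarantee MDS-level fault tolerance. It only shows the linear relation that lets parity blocks repair one another.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using reduction polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def combine(coeffs, blocks):
    """Return sum_j coeffs[j] * blocks[j], bytewise over GF(2^8)."""
    out = bytearray(len(blocks[0]))
    for c, block in zip(coeffs, blocks):
        for i, byte in enumerate(block):
            out[i] ^= gf_mul(c, byte)
    return bytes(out)


def xor_blocks(blocks):
    """Add blocks over GF(2^8); field addition is bytewise XOR."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Toy stripe: k data blocks in l equal local groups (hypothetical sizes).
k, l, block_len = 8, 2, 16
group = k // l
data = [bytes((7 * j + i) % 256 for i in range(block_len)) for j in range(k)]
coeffs = list(range(1, k + 1))  # hypothetical nonzero coefficients

# Each local parity reuses the global parity's coefficients for its group,
# so the global parity equals the sum of all local parities.
local_parities = [
    combine(coeffs[i * group:(i + 1) * group], data[i * group:(i + 1) * group])
    for i in range(l)
]
global_parity = combine(coeffs, data)
assert xor_blocks(local_parities) == global_parity

# Repair a lost global parity from l local parities instead of k data blocks.
repaired_global = xor_blocks(local_parities)

# Repair a lost local parity from the global parity and the other local parities.
repaired_local = xor_blocks([global_parity] + local_parities[1:])
assert repaired_local == local_parities[0]
```

Here both parity repairs read `l = 2` blocks; without the cascade relation, rebuilding the global parity would read all `k = 8` data blocks and rebuilding a local parity would read its `k / l = 4` group members.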
Fan Yu
Key Laboratory of Cryptologic Technology and Information Security, Ministry of Education, and the School of Cyber Science and Technology, Shandong University, Qingdao, Shandong 266237, China
Guodong Li
Key Laboratory of Cryptologic Technology and Information Security, Ministry of Education, and the School of Cyber Science and Technology, Shandong University, Qingdao, Shandong 266237, China
Si Wu
School of Computer Science and Technology, Shandong University, Qingdao, Shandong 266237, China
Weijun Fang
Shandong University
Coding Theory
Sihuang Hu
Shandong University
Coding theory, Combinatorics, Lattices