🤖 AI Summary
This paper addresses the efficient conversion of maximum distance separable (MDS) codes in large-scale distributed storage systems, focusing on the merge regime, in which several codewords of an initial MDS code are combined into a single codeword of a final MDS code. We propose three constructions of convertible MDS codes that achieve optimal access cost, and a fourth construction that optimizes bandwidth cost. Methodologically, we combine finite-field algebraic constructions, an extension of a prior scheme by Kong, and a refinement of Maturana and Rashmi's piggybacking-based construction. Our key contributions are: (1) access-optimal conversion in the merge regime, with two constructions that hold under divisibility conditions and a third that supports arbitrary parameter regimes; (2) field sizes that are linear in the final code length, with the third construction matching the lower bound implied by the MDS conjecture in almost all cases; and (3) a bandwidth-optimal construction with reduced sub-packetization.
📝 Abstract
In large-scale distributed storage systems, erasure coding is employed to ensure reliability against disk failures. Recent work by Kadekodi et al. demonstrates that adapting code parameters to varying disk failure rates can lead to significant storage savings without compromising reliability. Such adaptations, known as \emph{code conversions}, motivate the design of \emph{convertible codes}, which enable efficient transformations between codes of different parameters.
In this work, we study the setting in which $\lambda$ codewords of an initial $[n^I = k^I + r^I,\, k^I]$ MDS code are merged into a single codeword of a final $[n^F = \lambda k^I + r^F,\, k^F = \lambda k^I]$ MDS code. We begin by presenting three constructions that achieve optimal \emph{access cost}, defined as the total number of disks accessed during the conversion process. The first two constructions apply when $\lambda \leq r^I$ and impose specific divisibility conditions on $r^I$ and the field size $q$. These schemes minimize both the per-symbol and the overall access cost. The third construction, which builds on a prior scheme by Kong, achieves minimal access cost while supporting arbitrary parameter regimes. All three constructions require field sizes that are linear in the final code length, and notably, the third construction achieves a field size that matches the lower bound implied by the MDS conjecture in almost all cases. In addition, we propose a construction that optimizes the \emph{bandwidth cost}, defined as the total number of symbols transmitted during conversion. This scheme is a refinement of Maturana and Rashmi's bandwidth-optimal construction based on the piggybacking framework, and achieves reduced sub-packetization.
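To make the merge setting concrete, the following minimal sketch computes the final code parameters from the initial ones and two reference quantities. The function names are hypothetical and not taken from the paper. The baseline cost is that of the naive decode-and-re-encode approach (read every data symbol, write every new parity), not the cost of the proposed constructions. The field-size figure is the general-case bound $q \geq n^F - 1$ implied by the MDS conjecture, ignoring the exceptional cases for even $q$ and the requirement that $q$ be a prime power.

```python
def merge_parameters(lam, k_i, r_i, r_f):
    """Parameters of the merge regime: lam initial [k_i + r_i, k_i] codewords
    become one final [lam * k_i + r_f, lam * k_i] codeword."""
    if lam < 2 or min(k_i, r_i, r_f) < 1:
        raise ValueError("need lam >= 2 and positive k_i, r_i, r_f")
    k_f = lam * k_i
    return {"n_i": k_i + r_i, "k_i": k_i, "n_f": k_f + r_f, "k_f": k_f}


def default_access_cost(lam, k_i, r_f):
    """Baseline only (assumption: naive re-encoding): read k_i data symbols
    from each of the lam initial codewords, then write r_f new parities."""
    return lam * k_i + r_f


def conjectured_min_field_size(n_f):
    """General-case bound from the MDS conjecture (n <= q + 1), i.e. q >= n - 1.
    Ignores the even-q exceptions (n <= q + 2) and does not round up to a
    prime power."""
    return n_f - 1


if __name__ == "__main__":
    params = merge_parameters(lam=2, k_i=4, r_i=2, r_f=2)
    print(params)
    print("baseline access cost:", default_access_cost(2, 4, 2))
    print("field size at least:", conjectured_min_field_size(params["n_f"]))
```

With $\lambda = 2$, $k^I = 4$ and $r^I = r^F = 2$, two $[6, 4]$ codewords merge into one $[10, 8]$ codeword. The naive baseline accesses $2 \cdot 4 + 2 = 10$ disks, and the conjecture-implied bound gives $q \geq 9$.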