Accelerating Data Access for Single Node in Distributed Storage Systems via MDS Codes

📅 2025-01-20
🤖 AI Summary
To address the high single-node read latency of Maximum Distance Separable (MDS) array codes in distributed storage systems, this paper models and optimizes single-node access latency. We propose two low-latency read algorithms, one for each assumed latency distribution: uniform and shifted exponential. We prove that the expected latency reduction ratios are $(n-k)(n-k+1)/[n(n+1)]$ and $(n-k)/n$, respectively, where $n$ is the total number of nodes and $k$ is the number of data nodes. In the worst-case analysis for $(n,k)=(3,2)$, the reduction ratio exceeds 60%. Monte Carlo simulations show lower data access latency than the baseline algorithm. The approach combines probabilistic modeling and analytical derivation, with closed-form expressions for the expected reduction ratios, and simulation-based evaluation.

📝 Abstract
Maximum distance separable (MDS) array codes are widely employed in modern distributed storage systems to provide high data reliability with small storage overhead. The data access latency of a single node in a distributed storage system is as important as the data access latency of an entire file. In this paper, we propose two algorithms that reduce the data access latency on a single node in different scenarios for MDS codes. We show theoretically that our algorithms have expected reduction ratios of $\frac{(n-k)(n-k+1)}{n(n+1)}$ and $\frac{n-k}{n}$ for the data access latency of a single node when that latency follows a uniform distribution and a shifted-exponential distribution, respectively, where $n$ is the total number of nodes and $k$ is the number of data nodes. In the worst-case analysis, we show that our algorithms have a reduction ratio of more than $60\%$ when $(n,k)=(3,2)$. Furthermore, Monte Carlo simulations show that our algorithms achieve lower data access latency than the baseline algorithm.
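
To make the two closed-form expressions concrete, the sketch below simply evaluates them for a few code parameters. It only does the arithmetic of the formulas quoted in the abstract and does not implement the paper's read algorithms. The function names and the sample $(n,k)$ pairs other than $(3,2)$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def _check(n: int, k: int) -> None:
    # An (n, k) MDS code has k data nodes out of n total nodes.
    if not (0 < k <= n):
        raise ValueError("expected 0 < k <= n")


def uniform_reduction_ratio(n: int, k: int) -> Fraction:
    """Evaluate (n-k)(n-k+1) / (n(n+1)), the ratio quoted for uniform latency."""
    _check(n, k)
    return Fraction((n - k) * (n - k + 1), n * (n + 1))


def shifted_exp_reduction_ratio(n: int, k: int) -> Fraction:
    """Evaluate (n-k) / n, the ratio quoted for shifted-exponential latency."""
    _check(n, k)
    return Fraction(n - k, n)


if __name__ == "__main__":
    # (3, 2) appears in the abstract; the other pairs are illustrative only.
    for n, k in [(3, 2), (6, 4), (14, 10)]:
        u = uniform_reduction_ratio(n, k)
        s = shifted_exp_reduction_ratio(n, k)
        print(f"(n,k)=({n},{k}): uniform={u} ({float(u):.3f}), "
              f"shifted-exp={s} ({float(s):.3f})")
```

For $(n,k)=(3,2)$ the formulas give $1/6$ and $1/3$. These are expected-case values and are a separate quantity from the worst-case reduction of more than $60\%$ reported for the same parameters.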
Problem

Research questions and friction points this paper is trying to address.

Distributed Storage Systems
MDS Array Codes
Read Latency Reduction
Innovation

Methods, ideas, or system contributions that make the work stand out.

MDS Array Codes
Reduced Read Latency
Monte Carlo Simulation