🤖 AI Summary
To address the limitations of MDS coding in elastic computing (poor straggler tolerance, high upload overhead, and excessive storage redundancy), this paper proposes the Lagrange Coded Storage with Uncoded Download (LCSUD) framework. LCSUD applies Lagrange interpolation coding at the storage layer and pairs it with an uncoded data download mechanism, enabling coordinated elastic scheduling, dynamic node scaling, and real-time straggler recovery. Compared to state-of-the-art approaches, LCSUD reduces storage overhead by approximately 40%, lowers encoding complexity to *O*(*n* log *n*), cuts upload bandwidth by over 50%, and eliminates redundant transmission and decoding latency. Experimental evaluation on cloud-based matrix multiplication tasks demonstrates superior efficiency and scalability.
📝 Abstract
Coded elastic computing, introduced by Yang et al. in 2018, is a technique designed to mitigate the impact of elasticity in cloud computing systems, where machines can be preempted or added during computing rounds. The original approach uses maximum distance separable (MDS) coding for both storage and download in matrix-matrix multiplications. However, that scheme cannot tolerate stragglers and has high encoding complexity and upload cost. In 2023, we addressed these limitations by employing uncoded storage and Lagrange-coded download; however, that approach requires a large storage size. To address the challenges of storage size and upload cost, in this paper we focus on Lagrange-coded elastic computing based on uncoded download. We propose a new class of elastic computing schemes using Lagrange-coded storage with uncoded download (LCSUD). The proposed schemes address both elasticity and straggler challenges while achieving lower storage size, encoding complexity, and upload cost than existing methods.
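As a rough illustration of the general idea behind Lagrange-coded storage, the sketch below encodes two blocks of a matrix A as evaluations of a Lagrange interpolation polynomial, lets each machine multiply its stored coded block by an uncoded matrix B, and recovers every product from any two responses. This is a generic toy example and not the paper's LCSUD construction: the evaluation points, block sizes, machine count, and the helper names `basis`, `combine`, and `matmul` are all assumptions, and elastic task reassignment is not modeled.

```python
from fractions import Fraction

def basis(points, j, z):
    """Value at z of the j-th Lagrange basis polynomial over `points`."""
    out = Fraction(1)
    for m, p in enumerate(points):
        if m != j:
            out *= Fraction(z - p, points[j] - p)
    return out

def combine(points, blocks, z):
    """Evaluate at z the matrix polynomial passing through (points[j], blocks[j])."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(basis(points, j, z) * blocks[j][r][c] for j in range(len(blocks)))
             for c in range(cols)] for r in range(rows)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

# Hypothetical setup: K = 2 data blocks of A, N = 4 machines.
betas = [0, 1]            # interpolation points tied to the data blocks (assumed)
alphas = [2, 3, 4, 5]     # evaluation points, one per machine (assumed)
A_blocks = [[[1, 2]], [[3, 4]]]   # two 1x2 row blocks of A
B = [[1, 0], [2, 1]]              # used as-is, without any coding

# Storage: machine i keeps one coded block, the polynomial evaluated at alphas[i].
stored = [combine(betas, A_blocks, a) for a in alphas]

# Computation: only machines 1 and 3 respond; the others are stragglers or preempted.
results = {i: matmul(stored[i], B) for i in (1, 3)}

# Decoding: the product is linear in the stored block, so the results lie on a
# polynomial of degree K - 1 and any K of them determine it.
survivors = sorted(results)
pts = [alphas[i] for i in survivors]
vals = [results[i] for i in survivors]
recovered = [combine(pts, vals, b) for b in betas]

expected = [matmul(block, B) for block in A_blocks]
print(recovered == expected)
```

Exact rational arithmetic via `Fraction` is used here only to keep the toy example free of rounding error; a practical scheme would more likely work over a finite field or with floating-point matrices.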