From Indexing to Coding: A New Paradigm for Data Availability Sampling

📅 2025-09-25
🤖 AI Summary
Data availability (DA) is a critical bottleneck in blockchains: it limits how efficiently lightweight nodes can verify data and how far the system can scale. Existing data availability sampling (DAS) schemes compute commitments over the codewords of a fixed-rate erasure code, which restricts sampling to a pre-encoded symbol set and leaves redundancy inflexible and security insufficient. This paper introduces a “commitment–encoding decoupling” paradigm: cryptographic commitments are computed directly over the raw data, and random linear network coding (RLNC) is then performed dynamically during sampling, so that encoded symbols are generated on demand. This design removes the rigid redundancy constraints of fixed-rate codes and substantially improves sampling flexibility and information-theoretic security. According to the reported results, the approach gives lightweight nodes up to three orders of magnitude higher data availability assurance than conventional DAS, while also improving system robustness and scalability.

📝 Abstract
The data availability problem is a central challenge in blockchain systems and lies at the core of the accessibility and scalability issues faced by platforms such as Ethereum. Modern solutions take several approaches, of which data availability sampling (DAS) is the most self-sufficient and the most minimal in its security assumptions. Existing DAS methods typically form cryptographic commitments on codewords of fixed-rate erasure codes, which restricts light nodes to sampling from a predetermined set of coded symbols. In this paper, we introduce a new approach to DAS that modularizes the coding and commitment process by committing to the uncoded data while performing sampling through on-the-fly coding. The resulting samples are significantly more expressive. In concrete implementations, they give light nodes up to multiple orders of magnitude stronger assurances of data availability than sampling pre-committed symbols from a fixed-rate redundancy code, as done in established DAS schemes that use Reed-Solomon or low-density parity-check codes. We present a concrete protocol that realizes this paradigm using random linear network coding (RLNC).
Problem

Research questions and friction points this paper is trying to address.

Improving data availability in blockchain systems
Overcoming limitations of fixed-rate erasure coding
Enabling stronger data assurance through dynamic coding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modularizes coding and commitment processes
Performs sampling through on-the-fly coding
Realizes the paradigm with a concrete protocol based on random linear network coding (RLNC)
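
The three points above can be illustrated with a small sketch. It is not the paper's protocol, whose commitment scheme is not described in this summary. It assumes a linearly homomorphic commitment of the form G^x mod P, with toy and insecure parameters, so that a light node can check a freshly coded symbol against commitments made over the uncoded data. All names (`commit`, `draw_coefficients`, `code_sample`, `verify_sample`) and the parameters `P`, `Q`, `G` are hypothetical.

```python
import secrets

# Toy group parameters (assumption, NOT secure): P = 2*Q + 1 is a safe prime
# and G = 4 generates the subgroup of prime order Q in the integers mod P.
P = 2039
Q = 1019
G = 4


def commit(data):
    """Commit to each uncoded symbol x_i as G^x_i mod P."""
    return [pow(G, x % Q, P) for x in data]


def draw_coefficients(k):
    """Light node: pick a fresh random coefficient vector over Z_Q."""
    return [secrets.randbelow(Q) for _ in range(k)]


def code_sample(data, coeffs):
    """Serving node: form the requested linear combination on the fly."""
    return sum(c * x for c, x in zip(coeffs, data)) % Q


def verify_sample(commitments, coeffs, y):
    """Light node: compare G^y with the same combination of the commitments."""
    expected = 1
    for c_i, coeff in zip(commitments, coeffs):
        expected = (expected * pow(c_i, coeff, P)) % P
    return pow(G, y, P) == expected


data = [17, 250, 3, 999]                 # k = 4 uncoded symbols in Z_Q
commitments = commit(data)               # computed once, over raw data
coeffs = draw_coefficients(len(data))    # chosen at sampling time
y = code_sample(data, coeffs)            # coded symbol generated on demand
honest_ok = verify_sample(commitments, coeffs, y)
forged_ok = verify_sample(commitments, coeffs, (y + 1) % Q)
```

Here `honest_ok` is true and `forged_ok` is false. The coded symbol is never stored or committed in advance; the check works because the assumed commitment is homomorphic under linear combinations. A real deployment would need a cryptographically sized group and multi-element symbols.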
Moritz Grundei
Optimum, Cambridge, MA, USA
Aayush Rajasekaran
Optimum, Cambridge, MA, USA
Kishori Konwar
Optimum, Cambridge, MA, USA
Muriel Medard
Professor of EECS, MIT
Information theory, communications, networks, wireless, network coding