DNA Storage in the Short Molecule Regime

📅 2025-11-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work characterizes the capacity of DNA storage systems built from short molecules and completes the proof of the conjecture by Shomorony and Heckel (2022) on how the number of reliably storable information bits scales. It proposes two coding schemes: one achieves the optimal scaling across the entire short-molecule regime, while the other attains the same scaling with significantly lower computational complexity, except for a specific range of very short molecules. The first is a random-coding scheme in which each codeword is obtained by quantizing a probability mass function drawn at random from the probability simplex. Analyzing it under the optimal maximum-likelihood decoder yields an achievability bound that matches a recently established converse bound across the entire short-molecule regime, which establishes the capacity scaling law in this regime.

📝 Abstract
We study the amount of reliable information that can be stored in a DNA-based storage system composed of short DNA molecules. In this regime, Shomorony and Heckel (2022) put forward a conjecture on the scaling of the number of information bits that can be reliably stored. In this paper, we complete the proof of this conjecture. We analyze a random-coding scheme in which each codeword is obtained by quantizing a randomly generated probability mass function drawn from the probability simplex. By analyzing the optimal maximum-likelihood decoder, we derive an achievability bound that matches a recently established converse bound across the entire short-molecule regime. We also propose a second coding scheme, which operates with significantly lower computational complexity while still achieving the optimal scaling, except for a specific range of very short molecules.
Problem

Research questions and friction points this paper is trying to address.

Analyzing reliable information capacity in DNA storage with short molecules
Proving a conjecture about scaling of storable bits in DNA systems
Developing efficient coding schemes for DNA storage with optimal scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Targets DNA storage in the short-molecule regime
Random-coding scheme whose codewords are quantized random probability mass functions
Low-complexity coding scheme that achieves the optimal scaling, except for a range of very short molecules
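
The idea of building a codeword by quantizing a random probability mass function can be illustrated with a small sketch. This is not the paper's construction: the uniform draw from the simplex, the largest-remainder rounding rule, the reading of the PMF as a distribution over all 4^L possible molecules of length L, and the function names `random_pmf`, `quantize_pmf`, and `random_codeword` are all assumptions made for illustration.

```python
import random


def random_pmf(k, rng):
    """Draw a PMF uniformly from the simplex (assumed distribution).

    Normalizing k independent Exp(1) samples gives a Dirichlet(1, ..., 1) draw.
    """
    g = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(g)
    return [x / total for x in g]


def quantize_pmf(pmf, m):
    """Round a PMF to non-negative integer counts that sum to m.

    Uses largest-remainder rounding (an assumed rule, chosen for simplicity).
    """
    scaled = [p * m for p in pmf]
    counts = [int(x) for x in scaled]  # floor, since all entries are >= 0
    shortfall = m - sum(counts)
    # Hand the remaining units to the entries with the largest fractional parts.
    order = sorted(range(len(pmf)),
                   key=lambda i: scaled[i] - counts[i],
                   reverse=True)
    for i in order[:shortfall]:
        counts[i] += 1
    return counts


def random_codeword(molecule_length, num_molecules, seed=0):
    """Hypothetical codeword: how many copies of each possible molecule to store."""
    rng = random.Random(seed)
    k = 4 ** molecule_length  # four nucleotides: A, C, G, T
    return quantize_pmf(random_pmf(k, rng), num_molecules)


if __name__ == "__main__":
    codeword = random_codeword(molecule_length=3, num_molecules=100)
    print(len(codeword), sum(codeword))
```

Under these assumptions a codeword is a vector of counts, one per possible molecule, so the order of molecules carries no information. A full random code would repeat this draw independently for each message.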
Ran Tamir
Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
Nir Weinberger
Assistant Professor
Information theory
Albert Guillén i Fàbregas
Department of Engineering, University of Cambridge, CB2 1PZ Cambridge, UK, and with the Department of Signal Theory and Communications, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain