$mathsf{GC+}$ Code: a Short Systematic Code for Correcting Random Edit Errors in DNA Storage

📅 2024-02-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
DNA storage suffers from unreliable data recovery due to synthesis constraints on short sequences and stochastic edit errors (substitutions, insertions, deletions). Method: This paper proposes a novel error-correcting code featuring short codeword length (≤100 nt), systematic structure, and efficient encoding/decoding. It jointly optimizes GC-content constraints, stochastic edit-channel modeling, sparse parity-check matrix construction, and a lightweight iterative decoding algorithm—achieving systematicity and low computational overhead simultaneously for the first time. Contribution/Results: Theoretical analysis confirms feasibility for ultra-short codewords; simulations demonstrate >99.5% data recovery rate under realistic edit error rates (1%–5%). End-to-end experiments show significant error-correction improvement over state-of-the-art short codes, marking the first practical breakthrough for edit-correcting codes in real-world DNA storage systems.

Technology Category

Application Category

📝 Abstract
Storing digital data in synthetic DNA faces challenges in ensuring data reliability in the presence of edit errors -- deletions, insertions, and substitutions -- that occur randomly during various phases of the storage process. Current limitations in DNA synthesis technology also require the use of short DNA sequences, highlighting the particular need for short edit-correcting codes. Motivated by these factors, we introduce a systematic code designed to correct random edits while adhering to typical length constraints in DNA storage. We evaluate the performance of the code through simulations and assess its effectiveness within a DNA storage framework, revealing promising results.
Problem

Research questions and friction points this paper is trying to address.

Correcting random edit errors in DNA storage
Addressing short sequence constraints in DNA synthesis
Designing systematic codes for DNA data reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic code for random edit errors
Designed for short DNA sequence constraints
Theoretical and simulation performance evaluation
🔎 Similar Papers
No similar papers found.