Constrained Error-Correcting Codes for Efficient DNA Synthesis

📅 2025-04-14

📈 Citations: 0

✨ Influential: 0

career value

250K/year

🤖 AI Summary

DNA parallel synthesis faces high costs due to stringent requirements of a fixed supersequence and simultaneous constraints—namely, an *l*-runlength limit (no homopolymer runs exceeding length *l*) and *ε*-balance (GC-content deviation ≤ *ε*). Existing coding schemes suffer from fundamental trade-offs between code rate and error-correction capability. Method: We propose the first capacity-optimal coding scheme that jointly satisfies both constraints strictly while supporting single-insertion/deletion correction. Our approach integrates combinatorial construction, finite-state machine (FSM) encoding, and synchronized decoding, achieving *O*(*n*)-time encoding and decoding. Contribution/Results: This is the first scheme to achieve the theoretical channel capacity under these dual constraints. It guarantees 100% constraint satisfaction and 100% single-edit correction success rate—surpassing prior art in both rate and robustness. Experimental results demonstrate substantial improvements in synthesis efficiency and data reliability for DNA-based storage.

Technology Category

Application Category

📝 Abstract

DNA synthesis is considered as one of the most expensive components in current DNA storage systems. In this paper, focusing on a common synthesis machine, which generates multiple DNA strands in parallel following a fixed supersequence,we propose constrained codes with polynomial-time encoding and decoding algorithms. Compared to the existing works, our codes simultaneously satisfy both l-runlength limited and {epsilon}-balanced constraints. By enumerating all valid sequences, our codes achieve the maximum rate, matching the capacity. Additionally, we design constrained error-correcting codes capable of correcting one insertion or deletion in the obtained DNA sequence while still adhering to the constraints.

Problem

Research questions and friction points this paper is trying to address.

Reduce DNA synthesis cost in storage systems

Design constrained codes for parallel DNA synthesis

Correct insertion/deletion errors in DNA sequences

Innovation

Methods, ideas, or system contributions that make the work stand out.

Polynomial-time encoding and decoding algorithms

l-runlength limited and ε-balanced constraints

Error-correcting codes for insertion/deletion correction

🔎 Similar Papers

$mathsf{GC+}$ Code: a Short Systematic Code for Correcting Random Edit Errors in DNA Storage