Complex DNA Synthesis Sequences

📅 2025-10-24
🤖 AI Summary
DNA data storage is constrained by a trade-off between synthesis flexibility and parallel scalability: enzymatic synthesis allows unconstrained nucleotide additions to individual strands, whereas photolithographic synthesis enforces identical additions across many strands. This paper proposes a hybrid *constrained parallel synthesis* framework that generalizes both approaches: in each cycle, a nucleotide is selected from a restricted subset and incorporated in parallel. The framework gives rise to the notion of *complex synthesis sequences*. Extending the information rate definition of Lenz et al. and analyzing an analog of the deletion ball, the authors derive tight expressions for the maximal information rate and its asymptotic behavior, bridging the gap between constrained models and the idealized setting in which every nucleotide is always available. For known strands, a dynamic programming algorithm computes an optimal complex synthesis sequence. A separate two-dimensional array model with synthesis constraints over the rows extends previous synthesis models and comes with its own dynamic programming algorithm.

📝 Abstract
DNA-based storage offers unprecedented density and durability, but its scalability is fundamentally limited by the efficiency of parallel strand synthesis. Existing methods either allow unconstrained nucleotide additions to individual strands, such as enzymatic synthesis, or enforce identical additions across many strands, such as photolithographic synthesis. We introduce and analyze a hybrid synthesis framework that generalizes both approaches: in each cycle, a nucleotide is selected from a restricted subset and incorporated in parallel. This model gives rise to a new notion of a complex synthesis sequence. Building on this framework, we extend the information rate definition of Lenz et al. and analyze an analog of the deletion ball, defined and studied in this setting, deriving tight expressions for the maximal information rate and its asymptotic behavior. These results bridge the theoretical gap between constrained models and the idealized setting in which every nucleotide is always available. For the case of known strands, we design a dynamic programming algorithm that computes an optimal complex synthesis sequence, highlighting structural similarities to the shortest common supersequence problem. We also define a distinct two-dimensional array model with synthesis constraints over the rows, which extends previous synthesis models in the literature and captures new structural limitations in large-scale strand arrays, and we develop a dynamic programming algorithm for this model as well. Our results establish a new and comprehensive theoretical framework for constrained DNA synthesis, subsuming prior models and setting the stage for future advances in the field.
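
To make the link to the shortest common supersequence problem concrete, here is a minimal Python sketch under one assumed reading of the model; it is not the paper's algorithm. The assumptions are: each cycle offers a subset of exactly `k` nucleotides, and every strand whose next nucleotide lies in that subset is extended by it, while the others wait. The names `can_synthesize` and `shortest_complex_sequence` are hypothetical. The search is a breadth-first dynamic program over the vector of per-strand progress, so it is only practical for a handful of short strands.

```python
from collections import deque
from itertools import combinations

ALPHABET = "ACGT"


def can_synthesize(strand, cycles):
    """Greedy check: can `strand` be built by the given sequence of cycles?

    Each cycle is an iterable of allowed nucleotides (assumed model).
    Extending as early as possible never hurts, so greedy matching suffices.
    """
    pos = 0
    for allowed in cycles:
        if pos < len(strand) and strand[pos] in allowed:
            pos += 1
    return pos == len(strand)


def shortest_complex_sequence(strands, k):
    """Return a shortest list of k-subsets that synthesizes all strands.

    State = tuple of how many nucleotides of each strand are already built.
    Breadth-first search over states yields the minimum number of cycles.
    Assumes 1 <= k <= 4 and strands over the alphabet ACGT.
    """
    start = tuple(0 for _ in strands)
    goal = tuple(len(s) for s in strands)
    subsets = [frozenset(c) for c in combinations(ALPHABET, k)]

    parent = {start: None}  # state -> (previous state, subset used)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            break
        for subset in subsets:
            # Every strand whose next nucleotide is offered advances by one.
            nxt = tuple(
                p + 1 if p < len(s) and s[p] in subset else p
                for p, s in zip(state, strands)
            )
            if nxt != state and nxt not in parent:
                parent[nxt] = (state, subset)
                queue.append(nxt)

    # Walk back from the goal to recover the chosen subsets in order.
    cycles = []
    state = goal
    while parent[state] is not None:
        prev, subset = parent[state]
        cycles.append("".join(sorted(subset)))
        state = prev
    cycles.reverse()
    return cycles
```

Under these assumptions, `k = 1` reduces to the shortest common supersequence of the strands (three cycles for `AC` and `CA`), `k = 2` needs only two cycles for the same pair, and `k = 4` needs exactly as many cycles as the longest strand has nucleotides.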
Problem

Research questions and friction points this paper is trying to address.

Overcoming the scalability limits that parallel strand synthesis efficiency imposes on DNA-based storage
Developing a hybrid framework for constrained parallel nucleotide addition
Establishing theoretical foundations for optimal complex synthesis sequences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid synthesis framework with restricted nucleotide subsets
Dynamic programming algorithm for optimal synthesis sequences
Two-dimensional array model with synthesis constraints