Programmable Co-Transcriptional Splicing: Realizing Regular Languages via Hairpin Deletion

📅 2025-06-29
🤖 AI Summary
This work addresses the problem of designing small circular DNA templates from which co-transcriptional splicing produces a specified set of target RNA sequences. It gives a construction that encodes an arbitrary nondeterministic finite automaton (NFA) into a circular DNA template, so that co-transcriptional splicing, realized via hairpin deletion under the logarithmic energy model, produces all sequences accepted by the NFA. Because every finite language can be encoded efficiently as an NFA, the framework yields small templates for arbitrary finite sets of target RNA sequences. It is offered as a practical alternative to optimal encoding, which is NP-complete under the energy models considered. The search for the smallest possible templates leads to NFA minimization and practically motivated variants of it, which the authors show to be computationally intractable. The work combines formal language theory, molecular programming, and RNA structural modeling.

📝 Abstract
RNA co-transcriptionality, where RNA is spliced or folded during transcription from DNA templates, offers promising potential for molecular programming. It enables programmable folding of nano-scale RNA structures and has recently been shown to be Turing universal. While post-transcriptional splicing is well studied, co-transcriptional splicing is gaining attention for its efficiency, though its unpredictability remains a challenge. In this paper, we focus on engineering co-transcriptional splicing, not only as a natural phenomenon but as a programmable mechanism for generating specific RNA target sequences from DNA templates. The problem we address is whether we can encode a set of RNA sequences for a given system onto a DNA template word, ensuring that all the sequences are generated through co-transcriptional splicing. Given that finding the optimal encoding has been shown to be NP-complete under the various energy models considered, we propose a practical alternative approach under the logarithmic energy model. More specifically, we provide a construction that encodes an arbitrary nondeterministic finite automaton (NFA) into a circular DNA template from which co-transcriptional splicing produces all sequences accepted by the NFA. As all finite languages can be efficiently encoded as NFAs, this framework solves the problem of finding small DNA templates for arbitrary target sets of RNA sequences. The quest to obtain the smallest possible such templates naturally leads us to consider the problem of minimizing NFAs and certain practically motivated variants of it, but as we show, those minimization problems are computationally intractable.
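
The remark that every finite language can be efficiently encoded as an NFA can be illustrated with a prefix-tree (trie) automaton. The following is a minimal sketch, not the paper's construction: the function names, the state numbering, and the example target sequences are all assumptions made for illustration, and the step from NFA to circular DNA template is not modeled here.

```python
from collections import defaultdict


def finite_language_to_nfa(words):
    """Build a trie-shaped NFA with one state per distinct prefix.

    Hypothetical helper for illustration; returns
    (start_state, transitions, accepting_states, number_of_states).
    """
    transitions = defaultdict(set)  # (state, symbol) -> set of next states
    accepting = set()
    prefix_to_state = {"": 0}  # the empty prefix is the start state
    for word in words:
        for i, symbol in enumerate(word):
            prefix = word[: i + 1]
            if prefix not in prefix_to_state:
                prefix_to_state[prefix] = len(prefix_to_state)
            source = prefix_to_state[word[:i]]
            transitions[(source, symbol)].add(prefix_to_state[prefix])
        accepting.add(prefix_to_state[word])
    return 0, transitions, accepting, len(prefix_to_state)


def accepts(nfa, word):
    """Simulate the NFA by tracking the set of currently reachable states."""
    start, transitions, accepting, _ = nfa
    current = {start}
    for symbol in word:
        current = {t for s in current for t in transitions.get((s, symbol), ())}
        if not current:
            return False
    return bool(current & accepting)


# Hypothetical target set of RNA sequences over {A, C, G, U}.
targets = ["GAUC", "GAUU", "GCA"]
nfa = finite_language_to_nfa(targets)
print(all(accepts(nfa, w) for w in targets))  # every target is accepted
print(accepts(nfa, "GAU"))                    # a proper prefix is rejected
```

The trie has at most one state per prefix of a target, so its size is linear in the total length of the target set. Sharing common suffixes as well as prefixes can shrink it further, which is where the minimization questions raised in the abstract come in.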

Problem

Research questions and friction points this paper is trying to address.

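Engineering co-transcriptional splicing as a programmable mechanism for generating specific RNA target sequences
Encoding a set of RNA sequences onto a single DNA template so that all of them are generated through co-transcriptional splicing
Minimizing NFAs to obtain the smallest possible DNA templates

Innovation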

Methods, ideas, or system contributions that make the work stand out.

Programmable co-transcriptional splicing via hairpin deletion
Encoding NFAs into circular DNA templates
Working under the logarithmic energy model as a practical alternative to NP-complete optimal encoding