🤖 AI Summary
This work resolves two open problems on the computational complexity of partition function (PF) computation for nucleic acid secondary structures: (1) whether PF computation is #P-hard for systems with an unbounded number of strands; and (2) whether PF computation is #P-hard for a single strand when pseudoknots are permitted. Using a novel "magnification" technique, we prove that both problems are #P-hard. We further establish a tightly woven set of reductions among five key thermodynamic problems: minimum free energy (MFE), PF, their decision versions, and #SSEL, which counts structures of a given energy. These reductions show that the five problems are fundamentally related for any energy model amenable to magnification. Our results support the intuition that PF computation is at least as hard as MFE prediction, and the resulting general classification clarifies the mathematical landscape of nucleic acid energy models and raises several open questions.
📝 Abstract
To understand and engineer biological and artificial nucleic acid systems, algorithms are employed for prediction of secondary structures at thermodynamic equilibrium. Dynamic programming algorithms are used to compute the most favoured, or Minimum Free Energy (MFE), structure, and the Partition Function (PF), a tool for assigning a probability to any structure. However, in some situations, such as systems with large numbers of strands or with pseudoknots, NP-hardness results show that efficient algorithms are unlikely to exist, but only for MFE. Curiously, algorithmic hardness results were not shown for PF, leaving two open questions on the complexity of PF for multiple strands and for single strands with pseudoknots. The challenge is that while the MFE problem cares only about one, or a few, structures, PF is a summation over the entire secondary structure space, giving theorists the vibe that computing PF should not only be as hard as MFE, but should be even harder.
We answer both questions. First, we show that computing PF is #P-hard for systems with an unbounded number of strands, answering a question of Condon, Hajiaghayi, and Thachuk [DNA27]. Second, for even a single strand, but allowing pseudoknots, we find that PF is #P-hard. Our proof relies on a novel magnification trick that leads to a tightly woven set of reductions between five key thermodynamic problems: MFE, PF, their decision versions, and #SSEL, which counts structures of a given energy. Our reductions show these five problems are fundamentally related for any energy model amenable to magnification. This general classification clarifies the mathematical landscape of nucleic acid energy models and yields several open questions.
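To make the quantities named in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's method and says nothing about the hardness proofs. It assumes a hypothetical toy energy model in which every base pair contributes an energy of -1, a minimum hairpin loop of 3 unpaired bases, and a single pseudoknot-free strand. The names `structures`, `summarize`, `PAIRS`, and the parameter `rt` are illustrative choices and do not come from the source. The sketch enumerates every structure, then reports the MFE, the number of structures at each energy level (the quantity counted by #SSEL), and the PF as the Boltzmann-weighted sum over all structures.

```python
import math
from collections import Counter

# Assumption: Watson-Crick and GU wobble pairs are the only allowed pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def structures(seq, i, j, min_loop=3):
    """Yield each pseudoknot-free structure on seq[i:j] exactly once,
    as a tuple of (left, right) index pairs."""
    if i >= j:
        yield ()
        return
    # Case 1: base i is unpaired.
    yield from structures(seq, i + 1, j, min_loop)
    # Case 2: base i pairs with base k, splitting the interval in two.
    for k in range(i + min_loop + 1, j):
        if (seq[i], seq[k]) in PAIRS:
            for inner in structures(seq, i + 1, k, min_loop):
                for outer in structures(seq, k + 1, j, min_loop):
                    yield ((i, k),) + inner + outer


def summarize(seq, rt=1.0):
    """Brute-force MFE, per-energy structure counts, and partition function.

    Toy model (assumption): energy of a structure = -(number of base pairs).
    rt plays the role of RT in the Boltzmann factor exp(-E / RT).
    """
    counts = Counter(-len(s) for s in structures(seq, 0, len(seq)))
    mfe = min(counts)
    # PF = sum over structures of exp(-E/RT), grouped here by energy level.
    z = sum(n * math.exp(-e / rt) for e, n in counts.items())
    return {"mfe": mfe, "counts": dict(counts), "pf": z}


if __name__ == "__main__":
    result = summarize("GGGAAACCC")
    print(result)
    # Equilibrium probability of the (unique) MFE structure in this toy model.
    print(math.exp(-result["mfe"]) / result["pf"])
```

Grouping the sum by energy level shows why the per-energy counts determine the PF: once the number of structures at each energy is known, the PF is a weighted sum of those counts. The enumeration takes exponential time in the strand length, so this is only useful for tiny inputs.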