Differentiable Folding for Nearest Neighbor Model Optimization

📅 2025-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficiently fitting the approximately 13,000 parameters of the RNA nearest-neighbor thermodynamic model to experimental and structural data. We introduce the first differentiable RNA folding framework, built on classic folding algorithms (including Nussinov and maximum expected accuracy, MEA) so that gradients can be computed and backpropagated through the folding computation itself. The framework supports end-to-end, gradient-driven parameter optimization, joint training on heterogeneous data sources (e.g., RNAometer stability measurements and structural annotations), flexible loss design, and integration with deep learning pipelines. The learned parameter set increases the predicted probabilities of real RNA family sequence–structure pairs by over 23 orders of magnitude relative to Turner 2004, significantly outperforming existing baselines. This establishes a new paradigm for RNA secondary structure prediction: thermodynamically grounded, highly accurate, interpretable, and scalable.
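To make the core idea concrete, here is a minimal sketch (not the authors' implementation, and far simpler than the full nearest-neighbor model) of what "differentiable folding" means: a Nussinov-style dynamic program whose hard max is replaced by logsumexp, so the resulting score is a smooth function of a learnable pairing-parameter matrix and JAX can backpropagate through the recursion. The `theta` parameterization and sequence are illustrative assumptions.

```python
# Hypothetical sketch of a differentiable Nussinov-style recursion in JAX.
# Replacing max with logsumexp makes the DP score smooth in the parameters,
# so gradients can flow back to theta via automatic differentiation.
import jax
import jax.numpy as jnp

BASES = "ACGU"

def pair_score(theta, a, b):
    # theta: 4x4 learnable pairing-energy matrix (an assumed parameterization,
    # standing in for the ~13,000 nearest-neighbor parameters)
    return theta[a, b]

def soft_nussinov(theta, seq, min_loop=3):
    n = len(seq)
    # M[i][j] = soft (logsumexp-relaxed) score over structures of seq[i..j]
    M = [[jnp.array(0.0) for _ in range(n)] for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            cands = [M[i][j - 1]]  # case: position j left unpaired
            for k in range(i, j - min_loop):  # case: j pairs with some k
                left = M[i][k - 1] if k > i else jnp.array(0.0)
                inner = M[k + 1][j - 1]
                cands.append(left + inner + pair_score(theta, seq[k], seq[j]))
            # smooth relaxation of max -> differentiable everywhere
            M[i][j] = jax.nn.logsumexp(jnp.stack(cands))
    return M[0][n - 1]

seq = [BASES.index(c) for c in "GGGAAACCC"]  # toy hairpin-forming sequence
theta = jnp.zeros((4, 4)).at[2, 1].set(3.0).at[1, 2].set(3.0)  # favor G-C pairs
score, grads = jax.value_and_grad(soft_nussinov)(theta, seq)
```

The gradient `grads` tells us how the folding score responds to each pairing parameter, which is exactly what a gradient-driven fit to structural data needs; the paper applies the same principle to the full Turner-style energy model rather than this toy recursion.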

📝 Abstract
The Nearest Neighbor model is the *de facto* thermodynamic model of RNA secondary structure formation and is a cornerstone of RNA structure prediction and sequence design. The current functional form (Turner 2004) contains approximately 13,000 underlying thermodynamic parameters, and fitting these to both experimental and structural data is computationally challenging. Here, we leverage recent advances in *differentiable folding*, a method for directly computing gradients of the RNA folding algorithms, to devise an efficient, scalable, and flexible means of parameter optimization that uses known RNA structures and thermodynamic experiments. Our method yields a significantly improved parameter set that outperforms existing baselines on all metrics, including an increase in the average predicted probability of ground-truth sequence-structure pairs for a single RNA family by over 23 orders of magnitude. Our framework provides a path towards drastically improved RNA models, enabling the flexible incorporation of new experimental data, definition of novel loss terms, large training sets, and even treatment as a module in larger deep learning pipelines. We make available a new database, RNAometer, with experimentally-determined stabilities for small RNA model systems.
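The "efficient, scalable" parameter optimization the abstract describes boils down to ordinary gradient descent once the folding computation is differentiable. The toy below (a hypothetical sketch, not the paper's code or data) fits stack-energy parameters to made-up duplex stability measurements under the nearest-neighbor additivity assumption, the same shape of fit one would run against RNAometer-style data:

```python
# Hypothetical sketch: gradient-driven fitting of thermodynamic parameters to
# measured stabilities. Data and parameterization are invented for illustration.
import jax
import jax.numpy as jnp

# Toy data: each row counts how often a duplex contains each of 3 hypothetical
# stack types; y is its (made-up) measured folding free energy in kcal/mol.
X = jnp.array([[2., 1., 0.],
               [1., 0., 2.],
               [0., 2., 1.]])
y = jnp.array([-5.2, -4.1, -3.8])

def loss(theta):
    # Nearest-neighbor additivity: predicted stability is the sum of the
    # per-stack contributions; squared error against experiment.
    return jnp.mean((X @ theta - y) ** 2)

theta = jnp.zeros(3)
for _ in range(500):
    theta = theta - 0.05 * jax.grad(loss)(theta)
```

In the paper this loss is replaced by terms defined through the differentiable folding algorithms themselves (e.g., the probability of a known structure), which is what allows structural data and thermodynamic experiments to be fit jointly.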
Problem

Research questions and friction points this paper is trying to address.

Optimizing thermodynamic parameters for RNA secondary structure prediction.
Improving RNA model accuracy using differentiable folding techniques.
Enhancing RNA sequence-structure prediction with new experimental data.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentiable folding for RNA parameter optimization
Scalable method using RNA structures and experiments
Improved RNA models with new experimental data
Ryan K. Krueger
School of Engineering and Applied Sciences, Harvard University
Sharon Aviran
Associate Professor at UC Davis
Computational Genomics · RNA Biology · Signal Processing
David H. Mathews
Department of Biochemistry and Biophysics, University of Rochester Medical Center
Computational Biology · RNA · RNA Biology
Jeffrey Zuber
Department of Biochemistry and Biophysics and Center for RNA Biology, University of Rochester Medical Center
Max Ward
University of Western Australia
Algorithms · Computational Biology · RNA · AI