Factorization Machine with Quadratic-Optimization Annealing for RNA Inverse Folding and Evaluation of Binary-Integer Encoding and Nucleotide Assignment

📅 2026-02-18
🤖 AI Summary
This work addresses the RNA inverse folding problem: designing nucleotide sequences that stably adopt a target secondary structure while keeping the number of costly experimental evaluations small. It introduces a novel framework that applies the factorization machine with quadratic-optimization annealing (FMQA) to this problem. The authors map nucleotides to binary variables and systematically evaluate all 24 assignments of the four nucleotides to the integers 0-3, each combined with four binary-integer encodings (one-hot, domain-wall, binary, and unary). One-hot and domain-wall encodings achieve better normalized ensemble defect values than binary and unary encodings. In the domain-wall encoding, nucleotides assigned to the boundary integers (0 and 3) appear more frequently; assigning guanine and cytosine to those integers enriches them in stem regions and yields more thermodynamically stable secondary structures than one-hot encoding.
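
To make the encoding step concrete, the following is a minimal sketch of how a nucleotide sequence could be turned into binary variables under the four encodings and the 24 nucleotide-to-integer assignments. The bit conventions used here (leading ones for domain-wall, most-significant-bit-first for binary, value equal to the count of ones for unary) are common textbook conventions and are assumptions; the paper may order or orient the bits differently. All function names are hypothetical.

```python
from itertools import permutations

NUCLEOTIDES = "ACGU"

def one_hot(k, n=4):
    # n bits, exactly one of which is 1 (at index k).
    return [1 if i == k else 0 for i in range(n)]

def domain_wall(k, n=4):
    # n-1 bits: the first k bits are 1, the rest 0 (assumed orientation).
    return [1 if i < k else 0 for i in range(n - 1)]

def binary(k, n=4):
    # ceil(log2(n)) bits, most significant bit first (assumed order).
    width = max(1, (n - 1).bit_length())
    return [(k >> i) & 1 for i in reversed(range(width))]

def unary(k, n=4):
    # n-1 bits whose sum is k; this returns one canonical representative.
    return [1 if i < k else 0 for i in range(n - 1)]

def decode_unary(bits):
    # Any bit pattern is valid in unary: the value is the number of ones.
    return sum(bits)

def decode_domain_wall(bits):
    # Only non-increasing patterns (ones, then zeros) are valid codewords.
    if any(a < b for a, b in zip(bits, bits[1:])):
        raise ValueError("not a valid domain-wall codeword")
    return sum(bits)

# All 24 ways to assign the four nucleotides to the integers 0-3.
ASSIGNMENTS = [dict(zip(perm, range(4))) for perm in permutations(NUCLEOTIDES)]

def encode_sequence(seq, assignment, encoder):
    # Concatenate the per-nucleotide codewords into one flat bit list.
    bits = []
    for nucleotide in seq:
        bits.extend(encoder(assignment[nucleotide]))
    return bits

if __name__ == "__main__":
    gc_on_boundary = {"G": 0, "A": 1, "U": 2, "C": 3}
    for encoder in (one_hot, domain_wall, binary, unary):
        print(encoder.__name__, encode_sequence("GACU", gc_on_boundary, encoder))
```

Note that in this sketch unary and domain-wall produce the same canonical codeword; they differ in which bit patterns count as valid when decoding, which is one way the choice of encoding changes the search landscape.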

📝 Abstract
The RNA inverse folding problem aims to identify nucleotide sequences that preferentially adopt a given target secondary structure. While various heuristic and machine learning-based approaches have been proposed, many require a large number of sequence evaluations, which limits their applicability when experimental validation is costly. We propose a method to solve the problem using a factorization machine with quadratic-optimization annealing (FMQA). FMQA is a discrete black-box optimization method reported to obtain high-quality solutions with a limited number of evaluations. Applying FMQA to the problem requires converting nucleotides into binary variables. However, the influence of integer-to-nucleotide assignments and binary-integer encoding on the performance of FMQA has not been thoroughly investigated, even though such choices determine the structure of the surrogate model and the search landscape, and thus can directly affect solution quality. Therefore, this study aims both to establish a novel FMQA framework for RNA inverse folding and to analyze the effects of these assignments and encoding methods. We evaluated all 24 possible assignments of the four nucleotides to the ordered integers (0-3), in combination with four binary-integer encoding methods. Our results demonstrated that one-hot and domain-wall encodings outperform binary and unary encodings in terms of the normalized ensemble defect value. In domain-wall encoding, nucleotides assigned to the boundary integers (0 and 3) appeared with higher frequency. In the RNA inverse folding problem, assigning guanine and cytosine to these boundary integers promoted their enrichment in stem regions, which led to more thermodynamically stable secondary structures than those obtained with one-hot encoding.
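
The link between the surrogate model and the annealing step can be illustrated with a small sketch. A second-order factorization machine over binary variables predicts y = w0 + Σ w_i x_i + Σ_{i<j} ⟨v_i, v_j⟩ x_i x_j, so its trained parameters can be read off directly as a quadratic (QUBO) objective. The code below is not the authors' implementation: the parameter values are made up, exhaustive search stands in for the annealer, and any penalty terms needed to keep one-hot or domain-wall codewords valid are omitted.

```python
from itertools import product

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fm_predict(x, w0, w, V):
    # Second-order factorization machine prediction for a binary vector x.
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = sum(
        dot(V[i], V[j]) * x[i] * x[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    return w0 + linear + pairwise

def fm_to_qubo(w, V):
    # Upper-triangular QUBO matrix: linear weights on the diagonal
    # (x_i * x_i == x_i for binary x), factor inner products off it.
    n = len(w)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = w[i]
        for j in range(i + 1, n):
            Q[i][j] = dot(V[i], V[j])
    return Q

def qubo_energy(x, Q):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))

def brute_force_minimum(Q):
    # Stand-in for the annealing step; only feasible for tiny problems.
    n = len(Q)
    best = min(product([0, 1], repeat=n), key=lambda x: qubo_energy(x, Q))
    return list(best), qubo_energy(best, Q)

# Hypothetical trained parameters for 4 binary variables, factor rank 2.
W0 = 0.5
W = [-1.0, 0.25, 0.5, -0.75]
V = [[0.5, -1.0], [1.0, 0.5], [-0.5, 0.25], [0.0, 1.0]]

if __name__ == "__main__":
    Q = fm_to_qubo(W, V)
    x_best, energy = brute_force_minimum(Q)
    print("candidate:", x_best, "predicted value:", W0 + energy)
```

In an FMQA-style loop, the candidate returned by the minimization step would be decoded to a sequence, evaluated with the true (expensive) objective, added to the training data, and the surrogate refit. Because the pairwise terms act on individual bits, the encoding and the nucleotide assignment determine which interactions the surrogate can express.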
Problem

Research questions and friction points this paper is trying to address.

RNA inverse folding
binary-integer encoding
nucleotide assignment
discrete optimization
secondary structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

Factorization Machine
Quadratic-Optimization Annealing
RNA Inverse Folding
Binary-Integer Encoding
Domain-Wall Encoding
Shuta Kikuchi
Graduate School of Science and Technology, Keio University, Yokohama, Kanagawa 223-8522, Japan; Keio University Sustainable Quantum Artificial Intelligence Center (KSQAIC), Keio University, Minato-ku, Tokyo 108-8345, Japan
Shu Tanaka
Professor, Department of Applied Physics and Physico-Informatics, Keio University
Quantum annealing, Ising machine, Statistical mechanics, Quantum computation, Materials science