PAIR-Former: Budgeted Relational MIL for miRNA Target Prediction

📅 2026-01-31
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses miRNA–mRNA target prediction, where each transcript yields a large pool of candidate target sites (CTSs) but only a pair-level label is observed, and compute is limited. The authors formalize this setting as Budgeted Relational Multi-Instance Learning (BR-MIL), in which at most K candidate sites per bag receive expensive encoding and relational modeling under a hard compute budget. Their method, PAIR-Former, is a two-stage pipeline: a cheap scan over the full candidate pool, followed by selection of up to K diverse sites on CPU, whose interdependencies are modeled by a permutation-invariant Set Transformer aggregator. On the miRAW dataset at a practical budget of K=64, PAIR-Former outperforms strong pooling baselines and offers a controllable accuracy–compute trade-off as K varies. The paper also provides theory linking budgeted selection to approximation error that decreases with K and to generalization terms governed by K in the expensive relational component.
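
The budgeted selection step can be illustrated with a minimal sketch. The summary does not specify how diversity is measured, so everything here is an assumption for illustration: the `(position, cheap_score)` site format, the `select_diverse_sites` helper, and the `min_gap` spacing rule are hypothetical and are not the authors' procedure. The sketch only shows the general idea of ranking the whole pool by a cheap score and then keeping at most K sites that are not near-duplicates of each other.

```python
from typing import List, Tuple

def select_diverse_sites(
    sites: List[Tuple[int, float]],  # (position, cheap_score); hypothetical format
    k: int,
    min_gap: int = 10,               # assumed diversity rule: minimum spacing
) -> List[Tuple[int, float]]:
    """Greedily keep up to k high-scoring sites that are at least min_gap apart."""
    # Rank the full pool by cheap score, highest first; ties broken by position.
    ranked = sorted(sites, key=lambda s: (-s[1], s[0]))
    chosen: List[Tuple[int, float]] = []
    for pos, score in ranked:
        if len(chosen) == k:
            break  # hard budget reached
        # Skip candidates too close to an already selected site.
        if all(abs(pos - p) >= min_gap for p, _ in chosen):
            chosen.append((pos, score))
    return chosen

pool = [(0, 0.9), (3, 0.8), (40, 0.7), (42, 0.95), (100, 0.1)]
selected = select_diverse_sites(pool, k=3, min_gap=10)
```

With this toy pool, the sites at positions 3 and 40 are skipped because they sit next to higher-scoring neighbors, so the budget is spent on three well-separated sites rather than on the top three scores alone.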

πŸ“ Abstract
Functional miRNA–mRNA targeting is a large-bag prediction problem: each transcript yields a heavy-tailed pool of candidate target sites (CTSs), yet only a pair-level label is observed. We formalize this regime as Budgeted Relational Multi-Instance Learning (BR-MIL), where at most $K$ instances per bag may receive expensive encoding and relational processing under a hard compute budget. We propose PAIR-Former (Pool-Aware Instance-Relational Transformer), a BR-MIL pipeline that performs a cheap full-pool scan, selects up to $K$ diverse CTSs on CPU, and applies a permutation-invariant Set Transformer aggregator on the selected tokens. On miRAW, PAIR-Former outperforms strong pooling baselines at a practical operating budget ($K^\star = 64$) while providing a controllable accuracy–compute trade-off as $K$ varies. We further provide theory linking budgeted selection to (i) approximation error decreasing with $K$ and (ii) generalization terms governed by $K$ in the expensive relational component.
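
To make "permutation-invariant aggregator" concrete, here is a minimal sketch of single-query attention pooling over a set of token vectors. It is a simplified stand-in and not the Set Transformer used in the paper: there is one head, no learned projections, and the `seed` query is a fixed, hypothetical vector rather than a learned parameter. The `attention_pool` name is also an assumption. The point is only that a softmax-weighted sum over the selected tokens gives the same output regardless of the order in which the tokens are presented.

```python
import math
from typing import List

def attention_pool(tokens: List[List[float]], seed: List[float]) -> List[float]:
    """Pool a set of token vectors into one vector with softmax attention."""
    d = len(seed)
    # Scaled dot-product score of each token against the seed query.
    scores = [sum(q * x for q, x in zip(seed, t)) / math.sqrt(d) for t in tokens]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    # Weighted average over the set; a sum, so token order does not matter.
    return [sum(w * t[j] for w, t in zip(weights, tokens)) / total for j in range(d)]

tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy stand-ins for encoded CTSs
seed = [1.0, -1.0]                             # hypothetical fixed query
pooled = attention_pool(tokens, seed)
shuffled = attention_pool(tokens[::-1], seed)  # same set, reversed order
```

Here `pooled` and `shuffled` agree up to floating-point rounding, which is the property that lets the aggregator treat the K selected sites as an unordered bag.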
Problem

Research questions and friction points this paper is trying to address.

miRNA target prediction
Budgeted Relational Multi-Instance Learning
candidate target sites
compute budget
large-bag prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Budgeted Relational MIL
PAIR-Former
miRNA target prediction
Set Transformer
compute-efficient selection
Jiaqi Yin
University of Maryland
EDA, Logic Synthesis, Formal Verification
Baiming Chen
School of Medicine, Chinese University of Hong Kong, Shenzhen, Shenzhen, China
Jia Fei
Department of Biochemistry and Molecular Biology, Medical College, Jinan University, Guangzhou, China; Guangdong Engineering Technology Research Center of Drug Development for Small Nucleic Acids, Guangzhou, China; State Key Laboratory of Bioactive Molecules and Druggability Assessment, Jinan University, Guangzhou, China
Mingjun Yang
Shenzhen Jingtai Technology Co., Ltd. (XtalPi), Shenzhen, China