🤖 AI Summary
This work addresses a limitation of existing RNA interaction prediction methods: they rely on static fusion strategies and fail to capture the dynamic, context-dependent nature of binding. To overcome this, we introduce a dynamic alignment framework that, for the first time, integrates state space models into multimodal biological large language model (BioLLM) fusion. Our approach employs a bidirectional Mamba encoder to enable context-aware interactions between RNA and protein embeddings through a sequence state transition mechanism. Using BioLLM representations from ESM-2 and RiNALMo, augmented with Gaussian noise injection and optimized with Focal Loss, the method retains linear computational complexity while improving performance: it attains an MCC of 0.892 on RPI1460, surpassing the previous best by 5.2%, and achieves Pearson correlation coefficients exceeding 0.95 on both the riboswitch and repeat RNA affinity prediction tasks.
📝 Abstract
Accurate prediction of RNA-associated interactions is essential for understanding cellular regulation and advancing drug discovery. While Biological Large Language Models (BioLLMs) such as ESM-2 and RiNALMo provide powerful sequence representations, existing methods rely on static fusion strategies that fail to capture the dynamic, context-dependent nature of molecular binding. We introduce CrossLLM-Mamba, a novel framework that reformulates interaction prediction as a state-space alignment problem. By leveraging bidirectional Mamba encoders, our approach enables deep "crosstalk" between modality-specific embeddings through hidden state propagation, modeling interactions as dynamic sequence transitions rather than static feature overlaps. The framework maintains linear computational complexity, making it scalable to high-dimensional BioLLM embeddings. We further incorporate Gaussian noise injection and Focal Loss to enhance robustness against hard-negative samples. Comprehensive experiments across three interaction categories (RNA-protein, RNA-small molecule, and RNA-RNA) demonstrate that CrossLLM-Mamba achieves state-of-the-art performance. On the RPI1460 benchmark, our model attains an MCC of 0.892, surpassing the previous best by 5.2%. For binding affinity prediction, we achieve Pearson correlations exceeding 0.95 on riboswitch and repeat RNA subtypes. These results establish state-space modeling as a powerful paradigm for multi-modal biological interaction prediction.
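To make the training and evaluation ingredients named in the abstract concrete, the following is a minimal pure-Python sketch of a binary focal loss, Gaussian noise injection on an embedding vector, and the Matthews correlation coefficient (MCC). It is an illustration under stated assumptions, not the authors' implementation: the function names and the default values of `gamma`, `alpha`, and `sigma` are hypothetical, since the abstract does not report them.

```python
import math
import random


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    p is the predicted probability of the positive class, y is 0 or 1.
    gamma and alpha are illustrative defaults (assumed, not from the paper).
    """
    eps = 1e-12
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # guard against log(0)
    # (1 - p_t)^gamma down-weights easy, well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def add_gaussian_noise(embedding, sigma=0.01, rng=None):
    """Return a copy of the embedding with i.i.d. N(0, sigma^2) noise added.

    sigma is an assumed value; noise is typically applied during training only.
    """
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in embedding]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal is empty
    return (tp * tn - fp * fn) / denom
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the usual cross-entropy, which is a convenient sanity check. Larger `gamma` values shift the loss toward hard examples, matching the stated goal of robustness to hard negatives.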