Multimodal Mixture-of-Experts with Retrieval Augmentation for Protein Active Site Identification

📅 2026-03-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes MERA, a framework that targets two challenges in protein active site prediction: fragile single-instance predictions caused by sparse training data, and performance degradation when unreliable modalities dominate multimodal fusion. MERA is presented as the first retrieval-augmented framework for this task. Its hierarchical multi-expert retrieval aggregates contextual information at the chain, sequence, and active-site levels through residue-level mixture-of-experts gating. For robust multimodal fusion, it adds a modality reliability assessment grounded in Dempster–Shafer evidence theory, using belief mass functions and learnable discounting coefficients. On the ProTAD-Gen and TS125 datasets, MERA reports 90% AUPRC on active site prediction and significant gains on peptide-binding site identification.
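As a rough illustration of residue-level mixture-of-experts gating over the three retrieval views (chain, sequence, active site), the sketch below mixes per-residue expert contexts with softmax gate weights. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `residue_moe_gate`, the linear gate, and all tensor shapes are hypothetical, since the summary does not specify the architecture.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def residue_moe_gate(residue_feats, expert_contexts, gate_weights):
    """Mix expert contexts with a separate gate per residue.

    Assumed (hypothetical) shapes:
      residue_feats:   (L, d)    one feature vector per residue
      expert_contexts: (E, L, d) retrieved context from each of E experts
      gate_weights:    (d, E)    parameters of an assumed linear gate
    Returns the fused features (L, d) and the gate values (L, E).
    """
    # Each residue gets its own distribution over the E experts.
    gates = softmax(residue_feats @ gate_weights, axis=-1)
    # Weighted sum over experts, computed independently for each residue.
    fused = np.einsum("le,eld->ld", gates, expert_contexts)
    return fused, gates
```

With `E = 3`, the three expert slots would correspond to the chain, sequence, and active-site retrieval views. Because the gate is computed per residue, different positions in the same protein can lean on different views.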

📝 Abstract

Accurate identification of protein active sites at the residue level is crucial for understanding protein function and advancing drug discovery. However, current methods face two critical challenges: vulnerability in single-instance prediction due to sparse training data, and inadequate modality reliability estimation that leads to performance degradation when unreliable modalities dominate fusion processes. To address these challenges, we introduce Multimodal Mixture-of-Experts with Retrieval Augmentation (MERA), the first retrieval-augmented framework for protein active site identification. MERA employs hierarchical multi-expert retrieval that dynamically aggregates contextual information from chain, sequence, and active-site perspectives through residue-level mixture-of-experts gating. To prevent modality degradation, we propose a reliability-aware fusion strategy based on Dempster-Shafer evidence theory that quantifies modality trustworthiness through belief mass functions and learnable discounting coefficients, enabling principled multimodal integration. Extensive experiments on ProTAD-Gen and TS125 datasets demonstrate that MERA achieves state-of-the-art performance, with 90% AUPRC on active site prediction and significant gains on peptide-binding site identification, validating the effectiveness of retrieval-augmented multi-expert modeling and reliability-guided fusion.
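The reliability-aware fusion described in the abstract can be illustrated with the two textbook operations from Dempster–Shafer theory: Shafer discounting and Dempster's rule of combination. The sketch below is an assumption-laden toy, not the authors' method. It uses a binary frame {active, inactive} for a single residue, treats the discounting coefficient `alpha` as a fixed number rather than a learned parameter, and the names `Mass`, `discount`, and `combine` are hypothetical.

```python
from typing import NamedTuple


class Mass(NamedTuple):
    """Belief masses for one residue; the three values must sum to 1."""
    active: float
    inactive: float
    unknown: float  # mass on the whole frame {active, inactive}


def discount(m: Mass, alpha: float) -> Mass:
    """Shafer discounting: alpha in [0, 1] is the modality's reliability.

    Committed mass is scaled by alpha; the remainder moves to 'unknown'.
    """
    return Mass(alpha * m.active,
                alpha * m.inactive,
                1.0 - alpha + alpha * m.unknown)


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule of combination for two mass functions."""
    # Conflict: one source says 'active' while the other says 'inactive'.
    conflict = m1.active * m2.inactive + m1.inactive * m2.active
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    active = (m1.active * m2.active
              + m1.active * m2.unknown
              + m1.unknown * m2.active) / norm
    inactive = (m1.inactive * m2.inactive
                + m1.inactive * m2.unknown
                + m1.unknown * m2.inactive) / norm
    unknown = (m1.unknown * m2.unknown) / norm
    return Mass(active, inactive, unknown)


# Illustrative values only: a confident modality and a less reliable one.
confident = Mass(0.8, 0.1, 0.1)
doubtful = Mass(0.2, 0.6, 0.2)
fused = combine(confident, discount(doubtful, alpha=0.5))
```

With these illustrative numbers, discounting the second modality by 0.5 yields a fused belief in "active" of about 0.76, compared with about 0.68 when the same modality is combined at full weight. This is the sense in which discounting keeps an unreliable modality from dominating the fusion.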

Problem

Research questions and friction points this paper is trying to address.

protein active site identification

multimodal fusion

modality reliability

sparse training data

residue-level prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval-augmented learning

mixture-of-experts

multimodal fusion