A Novel Framework for Multi-Modal Protein Representation Learning

📅 2025-10-27
🤖 AI Summary
This paper addresses two key challenges in protein function prediction: (1) distribution mismatch among embeddings of intrinsic modalities (e.g., sequence and structure) produced by pre-trained encoders, and (2) noise in the relational graphs of extrinsic modalities (e.g., protein–protein interaction networks and Gene Ontology annotations). To tackle these, the authors propose DAMPE (Diffused and Aligned Multi-modal Protein Embedding), a unified multimodal representation framework. Its core innovations are: (1) optimal transport (OT)-based alignment that establishes correspondence between the heterogeneous intrinsic embedding spaces; and (2) a conditional graph generation (CGG) mechanism in which a condition encoder fuses the aligned embeddings to guide graph reconstruction, so that graph-aware knowledge is absorbed into the protein representations instead of relying on graph neural network (GNN) aggregation over noisy graphs. On standard Gene Ontology (GO) benchmarks, the method matches or surpasses state-of-the-art methods such as DPFunc, with reported AUPR gains of 0.002–0.013 and Fmax gains of 0.004–0.007. Ablation studies indicate that both the OT alignment and CGG modules contribute to the results.
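The summary reports results in Fmax without defining it. As a rough illustration, the sketch below computes the protein-centric Fmax commonly used in GO function-prediction benchmarks (maximum F1 over decision thresholds, with precision averaged over proteins that receive at least one prediction). It is an assumption that the paper uses exactly this variant; the function name `fmax_score` and the threshold grid are hypothetical choices made for this example.

```python
import numpy as np

def fmax_score(y_true, y_score, thresholds=None):
    """Protein-centric Fmax: max over thresholds of F1(avg precision, avg recall).

    y_true  : (n_proteins, n_terms) binary GO annotations
    y_score : (n_proteins, n_terms) predicted scores in [0, 1]
    """
    if thresholds is None:
        # Assumed grid; benchmarks typically sweep 0.01 .. 1.00
        thresholds = np.linspace(0.01, 1.0, 100)
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)

    n_true = y_true.sum(axis=1)
    has_true = n_true > 0            # recall is averaged over annotated proteins
    best = 0.0
    for t in thresholds:
        pred = y_score >= t
        n_pred = pred.sum(axis=1)
        covered = n_pred > 0         # precision only counts proteins with predictions
        if not covered.any() or not has_true.any():
            continue
        tp = (pred & y_true).sum(axis=1)
        precision = (tp[covered] / n_pred[covered]).mean()
        recall = (tp[has_true] / n_true[has_true]).mean()
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best

# Toy example: two proteins, two GO terms, perfectly separable scores.
toy_true = [[1, 0], [0, 1]]
toy_score = [[0.9, 0.1], [0.2, 0.8]]
toy_fmax = fmax_score(toy_true, toy_score)
```

Because Fmax takes the best threshold rather than a fixed one, small absolute differences such as those reported above are differences between per-method optima on this curve.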

📝 Abstract
Accurate protein function prediction requires integrating heterogeneous intrinsic signals (e.g., sequence and structure) with noisy extrinsic contexts (e.g., protein-protein interactions and GO term annotations). However, two key challenges hinder effective fusion: (i) cross-modal distributional mismatch among embeddings produced by pre-trained intrinsic encoders, and (ii) noisy relational graphs of extrinsic data that degrade GNN-based information aggregation. We propose Diffused and Aligned Multi-modal Protein Embedding (DAMPE), a unified framework that addresses these challenges through two core mechanisms. First, we introduce Optimal Transport (OT)-based representation alignment that establishes correspondence between the intrinsic embedding spaces of different modalities, effectively mitigating cross-modal heterogeneity. Second, we develop a Conditional Graph Generation (CGG)-based information fusion method, in which a condition encoder fuses the aligned intrinsic embeddings to provide informative cues for graph reconstruction. Our theoretical analysis further implies that the CGG objective drives this condition encoder to absorb graph-aware knowledge into the protein representations it produces. Empirically, DAMPE outperforms or matches state-of-the-art methods such as DPFunc on standard GO benchmarks, achieving AUPR gains of 0.002-0.013 pp and Fmax gains of 0.004-0.007 pp. Ablation studies further show that OT-based alignment contributes 0.043-0.064 pp AUPR, while CGG-based fusion adds 0.005-0.111 pp Fmax. Overall, DAMPE offers a scalable and theoretically grounded approach for robust multi-modal protein representation learning, substantially enhancing protein function prediction.
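To make the idea of OT-based alignment concrete, here is a minimal sketch of entropy-regularized optimal transport (Sinkhorn iterations) followed by a barycentric projection of one embedding set onto another. This is not DAMPE's implementation: the abstract does not specify the cost function, regularization, or solver, so the squared Euclidean cost, uniform marginals, the `reg` value, and the function names are all assumptions made for illustration. The sketch also assumes both modalities have already been projected to a shared dimension.

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT plan between two uniformly weighted point sets."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform mass on source embeddings
    b = np.full(m, 1.0 / m)          # uniform mass on target embeddings
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)              # rescale rows toward marginal a
        v = b / (K.T @ u)            # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

def align_embeddings(src, tgt, reg=0.1):
    """Map each source embedding to the plan-weighted average of target embeddings."""
    diff = src[:, None, :] - tgt[None, :, :]
    cost = (diff ** 2).sum(axis=-1)  # squared Euclidean cost (assumed)
    cost = cost / cost.max()         # normalize so reg has a stable scale
    plan = sinkhorn_plan(cost, reg=reg)
    weights = plan / plan.sum(axis=1, keepdims=True)
    return weights @ tgt, plan

# Toy stand-ins for sequence and structure embeddings (hypothetical data).
rng = np.random.default_rng(0)
seq_emb = rng.normal(size=(5, 4))
struct_emb = rng.normal(size=(6, 4)) + 3.0   # deliberately shifted distribution
aligned_seq, plan = align_embeddings(seq_emb, struct_emb)
```

After the projection, `aligned_seq` lies inside the convex hull of `struct_emb`, which is one simple way to remove a distribution shift between two embedding spaces before fusing them.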
Problem

Research questions and friction points this paper is trying to address.

Addresses cross-modal distribution mismatch in protein embeddings
Mitigates noisy relational graphs in extrinsic protein data
Enhances protein function prediction through unified representation learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aligns protein embeddings using Optimal Transport
Fuses aligned embeddings via Conditional Graph Generation
Integrates intrinsic and extrinsic protein information
Runjie Zheng
School of Computer Science and Engineering, Sun Yat-sen University (SYSU), No. 132, Outer Ring East Road, University Town, Panyu District, Guangzhou, 510006, Guangdong, China
Zhen Wang
School of Computer Science and Engineering, Sun Yat-sen University (SYSU), No. 132, Outer Ring East Road, University Town, Panyu District, Guangzhou, 510006, Guangdong, China
Anjie Qiao
School of Computer Science and Engineering, Sun Yat-sen University (SYSU), No. 132, Outer Ring East Road, University Town, Panyu District, Guangzhou, 510006, Guangdong, China
Jiancong Xie
School of Computer Science and Engineering, Sun Yat-sen University (SYSU), No. 132, Outer Ring East Road, University Town, Panyu District, Guangzhou, 510006, Guangdong, China
Jiahua Rao
Sun Yat-sen University
AI4Science, Multi-scale Learning
Yuedong Yang
School of Computer Science and Engineering, Sun Yat-sen University (SYSU), No. 132, Outer Ring East Road, University Town, Panyu District, Guangzhou, 510006, Guangdong, China