Comprehensive Metapath-based Heterogeneous Graph Transformer for Gene-Disease Association Prediction

📅 2024-12-03
🏛️ IEEE International Conference on Bioinformatics and Biomedicine
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high experimental cost and insufficient semantic modeling in gene–disease association (GDA) prediction, this paper proposes a heterogeneous graph learning framework integrating multi-source biological data. First, a heterogeneous graph is constructed and node features are initialized using BioGPT. Second, seven semantic metapaths are designed, and a metapath-aware Transformer is introduced to capture long-range dependencies. Third, a novel dual-level attention aggregation mechanism, operating both within each metapath (intra-metapath) and across metapaths (inter-metapath), is proposed to jointly encode heterogeneous structure, semantic paths, and node features. On multiple benchmark datasets, the method achieves 3.2–5.8% AUC improvements over state-of-the-art approaches, with significantly enhanced robustness. Ablation and visualization studies validate the effectiveness of semantic modeling and cross-path feature fusion. This work is the first to synergistically integrate BioGPT embeddings with metapath-guided Transformers for GDA prediction.
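To make the notion of a metapath instance concrete, here is a minimal sketch, not the paper's implementation. It enumerates the instances of one metapath on a toy heterogeneous graph. The gene-disease-gene metapath, the node names (`g1`, `d1`, ...), and the helper functions are all illustrative assumptions; the summary does not list the seven metapaths COMET actually uses.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def metapath_instances(start, metapath, node_type, adj):
    """Enumerate simple paths from `start` whose node types follow `metapath`."""
    if node_type[start] != metapath[0]:
        return []
    paths = [[start]]
    for wanted in metapath[1:]:
        # Extend every partial path by each neighbour of the required type,
        # skipping nodes already on the path so instances stay simple paths.
        paths = [
            p + [n]
            for p in paths
            for n in sorted(adj[p[-1]])
            if node_type[n] == wanted and n not in p
        ]
    return paths


# Toy graph (hypothetical data): three genes, two diseases.
node_type = {"g1": "gene", "g2": "gene", "g3": "gene",
             "d1": "disease", "d2": "disease"}
edges = [("g1", "d1"), ("g2", "d1"), ("g3", "d2"), ("g1", "d2")]
adj = build_adjacency(edges)

# Hypothetical metapath: gene -> disease -> gene.
instances = metapath_instances("g1", ("gene", "disease", "gene"), node_type, adj)
print(instances)
```

Each returned path is one instance of the metapath. In a metapath-based model, the feature vectors of the nodes along each instance are what the sequence encoder (a Transformer, in COMET's case) would consume.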

📝 Abstract
Discovering gene-disease associations is crucial for understanding disease mechanisms, yet identifying these associations remains challenging due to the time and cost of biological experiments. Computational methods are increasingly vital for efficient and scalable gene-disease association prediction. Graph-based learning models, which leverage node features and network relationships, are commonly employed for biomolecular predictions. However, existing methods often struggle to effectively integrate node features, heterogeneous structures, and semantic information. To address these challenges, we propose the COmprehensive MEtapath-based heterogeneous graph Transformer (COMET) for predicting gene-disease associations. COMET integrates diverse datasets to construct comprehensive heterogeneous networks, initializing node features with BioGPT. We define seven metapaths and utilize a transformer framework to aggregate metapath instances, capturing global contexts and long-distance dependencies. Through intra- and inter-metapath aggregation using attention mechanisms, COMET fuses latent vectors from multiple metapaths to enhance GDA prediction accuracy. Our method demonstrates superior robustness compared to state-of-the-art approaches. Ablation studies and visualizations validate COMET’s effectiveness, providing valuable insights for advancing human health research.
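The two-level aggregation described in the abstract can be illustrated with a small sketch. This is a simplified stand-in under stated assumptions, not COMET's actual architecture: plain scaled dot-product attention pooling replaces the Transformer encoder, the vectors are hand-written rather than BioGPT embeddings, and the names `attend`, `dual_level_aggregate`, `semantic_query`, `metapath_a`, and `metapath_b` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(query, vectors):
    """Scaled dot-product attention pooling; returns (pooled vector, weights)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, v) / scale for v in vectors])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors))
              for i in range(len(query))]
    return pooled, weights


def dual_level_aggregate(target, instances_by_metapath, semantic_query):
    """Fuse instance vectors in two attention steps.

    Intra-metapath: pool the instance vectors of each metapath,
    using the target node's vector as the query.
    Inter-metapath: pool the per-metapath vectors, using a query
    vector that would be learned in a real model (assumed fixed here).
    """
    per_path = {name: attend(target, vecs)[0]
                for name, vecs in instances_by_metapath.items()}
    names = list(per_path)
    fused, weights = attend(semantic_query, [per_path[n] for n in names])
    return fused, dict(zip(names, weights))


# Toy inputs (hypothetical): 2-dimensional vectors, two metapaths.
target = [1.0, 0.0]
instances_by_metapath = {
    "metapath_a": [[1.0, 0.0], [0.0, 1.0]],
    "metapath_b": [[0.0, 1.0], [0.0, 1.0]],
}
semantic_query = [1.0, 0.0]

fused, path_weights = dual_level_aggregate(
    target, instances_by_metapath, semantic_query
)
print(fused, path_weights)
```

The inter-metapath weights sum to one, so they can be read as the relative contribution of each metapath to the fused node representation. A real model would learn the query and projection parameters end to end.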
Problem

Research questions and friction points this paper is trying to address.

Gene-Disease Prediction
Efficiency Improvement
Accuracy Enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

COMET
heterogeneous graph Transformer
gene-disease prediction
Wentao Cui
Computer Network Information Center, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, China
Shoubo Li
Computer Network Information Center, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China
Chen Fang
Research Scientist @ Adobe Research
Computer Vision, Machine Learning
Qingqing Long
Computer Network Information Center, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China
Chengrui Wang
Alibaba Group
Computer Vision
Xuezhi Wang
Research Scientist, Google DeepMind
Machine Learning, Natural Language Processing
Yuanchun Zhou
Computer Network Information Center, CAS
Data Mining, Big Data Analysis