🤖 AI Summary
Biomedical knowledge graph completion requires jointly optimizing semantic understanding and structural learning: knowledge embedding excels at global semantics but struggles with dynamic structural modeling, whereas graph neural networks capture local topology effectively yet lack deep semantic representation. This paper proposes BioGraphFusion, a framework built around a semantics–structure co-evolution mechanism. Tensor decomposition establishes a global semantic foundation that guides an LSTM to dynamically refine relation embeddings during graph propagation. Query-driven subgraph sampling and a hybrid scoring scheme then combine semantic similarity with structural path-based scores. Evaluated on drug–target prediction, disease–gene association, and biological pathway inference, BioGraphFusion outperforms state-of-the-art knowledge embedding, GNN, and ensemble models. A case study identifies biologically relevant pathways associated with cutaneous malignant melanoma, demonstrating both interpretability and discovery capability.
📝 Abstract
Motivation: Biomedical knowledge graphs (KGs) are crucial for drug discovery and disease understanding, yet their completion and reasoning are challenging. Knowledge Embedding (KE) methods capture global semantics but struggle with dynamic structural integration, while Graph Neural Networks (GNNs) excel at modeling local structure but often lack semantic understanding. Even ensemble approaches, including those leveraging language models, often fail to achieve a deep, adaptive, and synergistic co-evolution between semantic comprehension and structural learning. Closing this gap requires continuous, reciprocal refinement between the two aspects in complex biomedical KGs.
Results: We introduce BioGraphFusion, a novel framework for deeply synergistic semantic and structural learning. BioGraphFusion establishes a global semantic foundation via tensor decomposition, guiding an LSTM-driven mechanism to dynamically refine relation embeddings during graph propagation. This fosters adaptive interplay between semantic understanding and structural learning, further enhanced by query-guided subgraph construction and a hybrid scoring mechanism. Experiments across three key biomedical tasks demonstrate BioGraphFusion's superior performance over state-of-the-art KE, GNN, and ensemble models. A case study on Cutaneous Malignant Melanoma 1 (CMM1) highlights its ability to unveil biologically meaningful pathways.
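As a rough illustration of what a hybrid scoring mechanism can look like, the sketch below combines a tensor-factorization-style semantic score with a structural score. It is not taken from the BioGraphFusion code: the DistMult-style trilinear product, the function names, the toy embeddings, and the mixing weight `alpha` are all assumptions chosen for illustration, and the structural score is simply passed in as a number rather than computed by graph propagation.

```python
from typing import Sequence


def trilinear_score(head: Sequence[float],
                    rel: Sequence[float],
                    tail: Sequence[float]) -> float:
    """Semantic score as a trilinear product of head, relation, and tail
    embeddings (DistMult-style; an assumed stand-in for the paper's
    tensor decomposition)."""
    return sum(h * r * t for h, r, t in zip(head, rel, tail))


def hybrid_score(semantic: float, structural: float, alpha: float = 0.5) -> float:
    """Convex combination of a semantic and a structural score.

    `alpha` is a hypothetical mixing weight, not a parameter from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * semantic + (1.0 - alpha) * structural


# Toy embeddings (made-up values) for a (drug, targets, gene) triple.
drug = [1.0, 0.5, -0.5]
targets = [1.0, 2.0, 1.0]
gene = [0.5, 1.0, 1.0]

semantic = trilinear_score(drug, targets, gene)
# The structural score would come from subgraph propagation; here it is a placeholder.
combined = hybrid_score(semantic, structural=3.0, alpha=0.25)
```

In a real system both terms would be learned jointly; the fixed weight here only shows how the two signals can be blended into a single ranking score.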
Availability and Implementation: Source code and all training data are freely available for download at https://github.com/Y-TARL/BioGraphFusion.
Contact: zjw@zjut.edu.cn, botao666666@126.com.
Supplementary information: Supplementary data are available at Bioinformatics online.