🤖 AI Summary
Problem: Background knowledge (BK), such as protein-protein interaction (PPI) networks, is widely assumed to improve graph neural network (GNN) performance in cancer subtype classification, yet its actual contribution and its robustness to imperfect knowledge remain poorly understood. Method: The authors introduce an evaluation framework that combines a synthetic setting, in which the BK is clearly informative, with real biomedical datasets, and they measure sensitivity to BK quality through graph perturbations, including edge deletion, rewiring, and noise injection. Contribution/Results: On real data, state-of-the-art BK-augmented GNNs perform no better than uninformed baselines such as linear regression, and their performance remains largely unchanged even when the BK graph is heavily perturbed, which suggests that the models make little effective use of the BK. The findings indicate that careful alignment between GNN architectures and BK characteristics is necessary and that such alignment holds the potential for significant performance improvements.
📝 Abstract
In complex and low-data domains such as biomedical research, incorporating background knowledge (BK) graphs, such as protein-protein interaction (PPI) networks, into graph-based machine learning pipelines is a promising research direction. However, while BK is often assumed to improve model performance, its actual contribution and the impact of imperfect knowledge remain poorly understood. In this work, we investigate the role of BK in an important real-world task: cancer subtype classification. Surprisingly, we find that (i) state-of-the-art GNNs using BK perform no better than uninformed models like linear regression, and (ii) their performance remains largely unchanged even when the BK graph is heavily perturbed. To understand these unexpected results, we introduce an evaluation framework that employs (i) a synthetic setting where the BK is clearly informative and (ii) a set of perturbations that simulate various imperfections in BK graphs. With this framework, we test the robustness of BK-aware models in both synthetic and real-world biomedical settings. Our findings reveal that careful alignment of GNN architectures and BK characteristics is necessary, and that such alignment holds the potential for significant performance improvements.
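To make the idea of perturbing a BK graph concrete, the following is a minimal sketch of two perturbations of the kind the summary names (edge deletion and rewiring) applied to an undirected edge list. It is an illustration only and not the authors' implementation: the function names `delete_edges` and `rewire_edges`, the `fraction` and `seed` parameters, and the choice of uniformly random replacement edges are all assumptions made here.

```python
import random


def delete_edges(edges, fraction, seed=0):
    """Return a copy of `edges` with a random `fraction` of them removed.

    Hypothetical helper: `edges` is a list of (u, v) node-index pairs.
    """
    rng = random.Random(seed)
    n_remove = int(round(fraction * len(edges)))
    removed = set(rng.sample(range(len(edges)), n_remove))
    return [e for i, e in enumerate(edges) if i not in removed]


def rewire_edges(edges, num_nodes, fraction, seed=0):
    """Replace a random `fraction` of edges with uniformly random new edges.

    Hypothetical helper: the graph is treated as undirected, so each edge is
    stored as (min, max). New edges have no self-loops and do not duplicate
    an edge that is currently present, so the edge count is preserved.
    """
    rng = random.Random(seed)

    def canon(u, v):
        return (min(u, v), max(u, v))

    current = {canon(u, v) for u, v in edges}
    n_rewire = int(round(fraction * len(current)))
    # Choose the edges to rewire up front, from the original graph only.
    for old in rng.sample(sorted(current), n_rewire):
        current.remove(old)
        while True:
            u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes
            new = canon(u, v)
            if new not in current:
                current.add(new)
                break
    return sorted(current)


# Example: a ring over 10 nodes as a stand-in for a (much larger) PPI graph.
ring = [(i, (i + 1) % 10) for i in range(10)]
sparser = delete_edges(ring, fraction=0.5)
shuffled = rewire_edges(ring, num_nodes=10, fraction=0.5)
```

Under these assumptions, a robustness check would train the same BK-aware model on the original graph and on graphs perturbed at increasing `fraction` values, then compare test performance; a flat curve would correspond to the insensitivity to BK quality that the abstract reports.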