🤖 AI Summary
Graph Neural Networks (GNNs) are susceptible to spurious correlations between node features and labels in training data, which impairs their generalization both in the independent and identically distributed (IID) setting and under out-of-distribution (OOD) shifts. To address this issue, this work proposes SCL-GNN, a novel framework that, for the first time, leverages the Hilbert-Schmidt Independence Criterion (HSIC) to quantify spurious associations between graph representations and class scores. SCL-GNN employs an end-to-end trainable bilevel optimization mechanism to jointly learn debiased representations and GNN parameters. Extensive experiments on multiple real-world and synthetic graph datasets demonstrate that SCL-GNN consistently outperforms existing methods across various distribution shift settings, exhibiting superior generalization capability and robustness.
📝 Abstract
Graph Neural Networks (GNNs) have demonstrated remarkable success across diverse tasks. However, their generalization capability is often hindered by spurious correlations between node features and labels in the graph. Our analysis reveals that GNNs tend to exploit subtle statistical correlations in the training data even when such correlations are unreliable for prediction. To address this challenge, we propose the Spurious Correlation Learning Graph Neural Network (SCL-GNN), a novel framework designed to enhance generalization on both Independent and Identically Distributed (IID) and Out-of-Distribution (OOD) graphs. SCL-GNN incorporates a principled spurious correlation learning mechanism, leveraging the Hilbert-Schmidt Independence Criterion (HSIC) to quantify correlations between node representations and class scores. This enables the model to effectively identify and mitigate spurious correlations that are irrelevant to the task yet influential on predictions. Additionally, we introduce an efficient bilevel optimization strategy that jointly optimizes the spurious correlation learning module and the GNN parameters, preventing overfitting. Extensive experiments on real-world and synthetic datasets demonstrate that SCL-GNN consistently outperforms state-of-the-art baselines under various distribution shifts, highlighting its robustness and generalization capabilities.
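The abstract does not spell out how the HSIC term is computed. As a rough illustration of the underlying statistic, the sketch below implements the standard biased empirical HSIC estimator with RBF kernels, which measures statistical dependence between two batches of vectors (e.g., node representations and class scores); the function names, kernel choice, and bandwidth are assumptions for illustration, not details taken from the paper:

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gram matrix of the Gaussian (RBF) kernel for rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    X, Y: (n, d_x) and (n, d_y) arrays of paired samples.
    Returns a nonnegative scalar; larger values indicate
    stronger statistical dependence between X and Y.
    """
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

In a debiasing objective like the one described, such a term would be minimized so that the learned representations carry as little spurious dependence on the biased signal as possible; a differentiable version (e.g., in PyTorch) would be needed to train end to end.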