Rethinking Graph Generalization through the Lens of Sharpness-Aware Minimization

📅 2026-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of graph neural networks (GNNs) to Minimal Shift Flip (MSF)—a phenomenon in which test samples that deviate only slightly from the training distribution are abruptly misclassified under distribution shift. From the perspective of Sharpness-Aware Minimization (SAM), the study establishes, for the first time, a theoretical connection between the local robust radius and generalization error in graph learning, and proposes an energy-based function as a computable proxy for this radius. Building on this insight, the authors introduce E2A, an energy-driven generative augmentation framework that leverages the energy landscape to guide the generation of pseudo out-of-distribution (OOD) samples, thereby enhancing model robustness. Extensive experiments demonstrate that E2A significantly outperforms existing methods across multiple benchmarks, effectively mitigating the MSF issue and consistently improving the OOD generalization capability of GNNs.
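The summary describes an energy-based function as a computable proxy for the robust radius. The paper's exact formulation is not reproduced here; as a hedged sketch, the standard energy score from the energy-based OOD detection literature, E(x) = −log Σ_k exp(f_k(x)) over classifier logits, can be computed as follows (the function name and example logits are illustrative):

```python
import numpy as np

def energy_score(logits: np.ndarray) -> np.ndarray:
    """Energy score E(x) = -log sum_k exp(f_k(x)) over the class logits.

    Lower energy corresponds to a confident, peaked prediction (typical
    of in-distribution inputs); higher energy flags likely OOD samples.
    """
    # Numerically stable log-sum-exp via the max trick.
    m = logits.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(logits - m).sum(axis=-1))
    return -lse

# A peaked logit vector yields lower energy than a flat one.
confident = np.array([[10.0, 0.0, 0.0]])
uncertain = np.array([[1.0, 1.0, 1.0]])
print(energy_score(confident))  # close to -10.0
print(energy_score(uncertain))  # -(1 + log 3), about -2.099
```

The stable log-sum-exp avoids overflow for large logits, which matters when the score is evaluated along a training trajectory where logits can grow.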

📝 Abstract
Graph Neural Networks (GNNs) have achieved remarkable success across various graph-based tasks but remain highly sensitive to distribution shifts. In this work, we focus on a prevalent yet under-explored phenomenon in graph generalization, Minimal Shift Flip (MSF), where test samples that slightly deviate from the training distribution are abruptly misclassified. To interpret this phenomenon, we revisit MSF through the lens of Sharpness-Aware Minimization (SAM), which characterizes the local stability and sharpness of the loss landscape while providing a theoretical foundation for modeling generalization error. To quantify loss sharpness, we introduce the concept of Local Robust Radius, measuring the smallest perturbation required to flip a prediction and establishing a theoretical link between local stability and generalization. Building on this perspective, we further observe a continual decrease in the robust radius during training, indicating weakened local stability and an increasingly sharp loss landscape that gives rise to MSF. To jointly address the MSF phenomenon and the intractability of the robust radius, we develop an energy-based formulation that is theoretically proven to be monotonically correlated with the robust radius, offering a tractable and principled objective for modeling flatness and stability. Building on these insights, we propose an energy-driven generative augmentation framework (E2A) that leverages energy-guided latent perturbations to generate pseudo-OOD samples and enhance model generalization. Extensive experiments across multiple benchmarks demonstrate that E2A consistently improves graph OOD generalization, outperforming state-of-the-art baselines.
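The abstract builds on SAM's two-step update: ascend to the worst-case point inside a small perturbation ball, then descend using the gradient taken there, which biases training toward flat minima. As a minimal illustration on a toy quadratic loss (the loss, learning rate, and radius `rho` are illustrative choices, not values from the paper):

```python
import numpy as np

def loss(w):
    # Toy "sharp" quadratic loss landscape with minimizer w = 1/3.
    return 0.5 * np.sum((3.0 * w - 1.0) ** 2)

def grad(w):
    return 3.0 * (3.0 * w - 1.0)

def sam_step(w, lr=0.05, rho=0.05):
    """One Sharpness-Aware Minimization step:
    1) move to the (first-order) worst-case point within an L2 ball
       of radius rho around the current weights;
    2) descend using the gradient evaluated at that perturbed point."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction, norm rho
    g_sharp = grad(w + eps)                      # gradient at perturbed weights
    return w - lr * g_sharp

w = np.array([0.0, 0.0])
for _ in range(100):
    w = sam_step(w)
# w settles in a tight band around the minimizer 1/3: because the gradient
# is taken rho away from the current point, the iterate hovers near the
# minimum rather than landing exactly on it.
print(w)
```

The ascent step `eps` is what distinguishes SAM from plain gradient descent; minimizing the perturbed loss upper-bounds the sharpness of the solution, which is the property the paper connects to the Local Robust Radius.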
Problem

Research questions and friction points this paper is trying to address.

Graph Generalization
Distribution Shift
Minimal Shift Flip
Out-of-Distribution
Loss Sharpness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sharpness-Aware Minimization
Graph Generalization
Local Robust Radius
Energy-based Augmentation
Out-of-Distribution Generalization
Yang Qiu
School of Computer Science and Technology, Huazhong University of Science and Technology
Yixiong Zou
Huazhong University of Science and Technology
Computer vision · Domain generalization · Few-shot learning · Vision-language model
Jun Wang
iWudao Tech