🤖 AI Summary
Problem: Existing graph neural network (GNN) adversarial attacks lack interpretable, quantitative characterization of individual perturbation strength, resulting in opaque, black-box attack processes.
Method: This paper introduces the concept of “structural noise,” formally defining the attack strength of an adversarial edge as its perturbation effect on the node classification margin. Building on this definition, it proposes a noise-driven, interpretable attack framework with single-step and multi-step attack strategies grounded in the interplay between structural noise and classification margin. The analysis also uncovers a topological pattern: high-centrality and low-homophily nodes are more susceptible to critical perturbations.
Contribution/Results: Extensive experiments across multiple benchmark datasets and three mainstream GNN architectures demonstrate that our framework significantly improves attack effectiveness while providing theoretically grounded, interpretable guidance for perturbation selection. It offers a new perspective for analyzing GNN robustness and informs principled defense design.
📝 Abstract
Graph neural networks are widely used for graph-related tasks because they learn effectively from the local information of neighboring nodes. However, recent studies on graph adversarial attacks have shown that current graph neural networks are not robust against malicious attacks. Much of the existing work optimizes an objective based on attack performance to obtain (near-)optimal perturbations, but pays little attention to quantifying the strength of each individual perturbation, such as the injection of a particular node or link. As a result, the choice of perturbations remains a black box that lacks interpretability. In this work, we propose the concept of noise to quantify the attack strength of each adversarial link. Furthermore, we propose three attack strategies based on the defined noise and classification margins, using single-step and multi-step optimization. Extensive experiments on benchmark datasets against three representative graph neural networks demonstrate the effectiveness of the proposed attack strategies. We also investigate the preferred patterns of effective adversarial perturbations by analyzing the properties of the selected perturbation nodes.
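To make the idea of scoring one adversarial link by its effect on the classification margin concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's formulation: the two-hop linearized GCN surrogate, the definition of noise as the drop in margin after flipping one edge, and all function names (`normalize_adj`, `margin`, `edge_noise`, `single_step_attack`) are hypothetical choices made for this example.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def logits(adj, features, weights):
    """Assumed surrogate model: two-hop linearized GCN, A_norm^2 X W."""
    a_norm = normalize_adj(adj)
    return a_norm @ a_norm @ features @ weights

def margin(adj, features, weights, node, label):
    """Classification margin: true-class logit minus the best other logit."""
    z = logits(adj, features, weights)[node]
    return z[label] - np.delete(z, label).max()

def edge_noise(adj, features, weights, node, label, other):
    """Assumed noise of flipping edge (node, other): the drop in margin."""
    flipped = adj.copy()
    flipped[node, other] = flipped[other, node] = 1 - flipped[node, other]
    clean = margin(adj, features, weights, node, label)
    return clean - margin(flipped, features, weights, node, label)

def single_step_attack(adj, features, weights, node, label):
    """Pick the one edge flip at `node` that has the largest noise."""
    candidates = [v for v in range(adj.shape[0]) if v != node]
    noises = [edge_noise(adj, features, weights, node, label, v)
              for v in candidates]
    best = int(np.argmax(noises))
    return candidates[best], noises[best]

if __name__ == "__main__":
    # Toy graph: nodes 0 and 1 belong to class 0, nodes 2 and 3 to class 1.
    adj = np.array([[0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
    features = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
    weights = np.eye(2)  # hypothetical "trained" weights
    target, noise = single_step_attack(adj, features, weights, node=0, label=0)
    print(f"flip edge (0, {target}); margin drops by {noise:.3f}")
```

In this toy setup, removing the same-class edge leaves the margin unchanged, while linking the target to a node of the other class lowers it, so each candidate link receives an explicit, comparable score. A multi-step variant could repeat the selection on the updated graph until the margin turns negative or a perturbation budget is exhausted.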