🤖 AI Summary
Problem: Existing graph label noise models (e.g., SLN, CCN) assume node label errors are independent, ignoring how structural dependencies, particularly edge relationships, influence noise generation. Method: We propose the Edge-Dependent Noise (EDN) model, which incorporates topological features (e.g., node degree) into label noise modeling and reveals a quantitative relationship between label error probability and node degree. EDN provides a graph-structure-aware noise generation mechanism and defines three variants with distinct structural biases. Contribution/Results: Extensive experiments across five GNN architectures and eight robust learning algorithms demonstrate that two EDN variants degrade model performance significantly more than traditional noise models (p < 0.01), indicating that they pose a harder challenge. This work provides a more realistic noise benchmark and a theoretical foundation for robust graph learning research.
📝 Abstract
An important structural feature of a graph is its set of edges, which captures the relationships among the nodes (the graph's topology). Existing node label noise models such as Symmetric Label Noise (SLN) and Class Conditional Noise (CCN) disregard these node relationships in graph data; the Edge-Dependent Noise (EDN) model addresses this limitation. EDN posits that in real-world scenarios, label noise may be influenced by the connections between nodes. We explore three variants of EDN. A crucial notion that relates nodes and edges in a graph is the degree of a node; we show that in all three variants, the probability that a node's label is corrupted depends on its degree. Additionally, we compare how these probabilities depend on node degree across the variants. We performed experiments on popular graph datasets using 5 different Graph Neural Network (GNN) architectures and 8 noise-robust algorithms for graph data. The results demonstrate that 2 variants of EDN lead to greater performance degradation in both GNNs and existing noise-robust algorithms than traditional node label noise models do. We statistically verify this by posing a suitable hypothesis-testing problem. This emphasizes the importance of incorporating EDN when evaluating noise-robust algorithms for graphs, to enhance the reliability of graph-based learning in noisy environments.
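To make the degree dependence concrete, here is a minimal sketch of one hypothetical edge-driven corruption process. The abstract does not define the three EDN variants, so this is not the paper's mechanism; it is an assumed toy model in which each edge independently "fires" with probability `q` and a fired edge corrupts both of its endpoints. Under that assumption, a node of degree d is corrupted with probability 1 - (1 - q)^d, which grows with d. The names `flip_probability`, `edge_triggered_noise`, and `q` are illustrative only.

```python
import random


def flip_probability(degree, q):
    """Analytic corruption probability for a node of the given degree,
    assuming each incident edge fires independently with probability q."""
    return 1.0 - (1.0 - q) ** degree


def edge_triggered_noise(edges, labels, num_classes, q, rng):
    """Toy edge-driven label noise (an assumption, not the paper's EDN).

    Each edge fires with probability q; every endpoint of a fired edge
    has its label replaced by a uniformly chosen different class.
    Returns the noisy label list and the set of corrupted nodes.
    """
    noisy = list(labels)
    hit = set()
    for u, v in edges:
        if rng.random() < q:
            hit.add(u)
            hit.add(v)
    for node in sorted(hit):
        alternatives = [c for c in range(num_classes) if c != labels[node]]
        noisy[node] = rng.choice(alternatives)
    return noisy, hit


# Star graph: node 0 is a hub of degree 5, nodes 1..5 are leaves of degree 1.
edges = [(0, leaf) for leaf in range(1, 6)]
labels = [0, 1, 2, 0, 1, 2]
rng = random.Random(0)
trials = 5000
hub_hits = 0
leaf_hits = 0
for _ in range(trials):
    _, hit = edge_triggered_noise(edges, labels, 3, 0.1, rng)
    hub_hits += 0 in hit
    leaf_hits += 1 in hit
hub_rate = hub_hits / trials
leaf_rate = leaf_hits / trials
print(f"hub:  empirical {hub_rate:.3f}, analytic {flip_probability(5, 0.1):.3f}")
print(f"leaf: empirical {leaf_rate:.3f}, analytic {flip_probability(1, 0.1):.3f}")
```

In this toy model the hub is corrupted far more often than any single leaf, which is the kind of degree-dependent behavior that degree-agnostic models such as SLN and CCN cannot produce. Other edge-driven mechanisms would yield a different dependence on degree.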