The Oversmoothing Fallacy: A Misguided Narrative in GNN Research

📅 2025-06-05
🤖 AI Summary
This paper challenges the prevailing consensus in graph neural network (GNN) research that oversmoothing is the primary cause of performance degradation in deep GNNs. Method: the authors systematically decouple the three core operations of a GNN layer (aggregation, linear transformation, and nonlinear activation) and use theoretical analysis and gradient propagation modeling to isolate their individual effects. They then empirically validate the findings by reproducing and strengthening classical stabilization techniques (e.g., residual connections and normalization). Contribution/Results: the analysis shows that performance degradation stems predominantly from vanishing gradients induced by the linear transformation and nonlinear activation, not from aggregation-induced oversmoothing, and that oversmoothing is therefore not intrinsic to GNNs. Well-designed deep GNNs achieve stable training without degradation. The work establishes a theoretical foundation and a practical architectural paradigm for building GNNs with hundreds of layers, revising the design principles for deep GNNs.
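The three core operations the summary refers to can be made concrete with a minimal sketch of a single GCN-style layer (an illustrative reimplementation, not the authors' code): neighborhood aggregation with a symmetrically normalized adjacency matrix, a learned linear transformation, and a ReLU nonlinearity. The toy graph and weights below are hypothetical.

```python
import numpy as np

def gcn_layer(A_hat, H, W):
    """One GCN layer decomposed into the paper's three core operations."""
    H_agg = A_hat @ H            # 1. aggregation (normalized neighborhood sum)
    H_lin = H_agg @ W            # 2. linear transformation
    return np.maximum(H_lin, 0)  # 3. nonlinear activation (ReLU)

# Toy 3-node path graph with self-loops, symmetrically normalized.
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))   # node features
W = rng.normal(size=(4, 4))   # layer weights
H_next = gcn_layer(A_hat, H, W)
print(H_next.shape)  # (3, 4)
```

The paper's point is that stacking many such layers fails not because the aggregation step smooths features, but because the transformation and activation steps shrink gradients during backpropagation.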

📝 Abstract
Oversmoothing has been recognized as a main obstacle to building deep Graph Neural Networks (GNNs), limiting their performance. This position paper argues that the influence of oversmoothing has been overstated and advocates for further exploration of deep GNN architectures. Given the three core operations of GNNs, aggregation, linear transformation, and non-linear activation, we show that prior studies have mistakenly confused oversmoothing with the vanishing gradient, caused by transformation and activation rather than aggregation. Our finding challenges prior beliefs about oversmoothing being unique to GNNs. Furthermore, we demonstrate that classical solutions such as skip connections and normalization enable the successful stacking of deep GNN layers without performance degradation. Our results clarify misconceptions about oversmoothing and shed new light on the potential of deep GNNs.
Problem

Research questions and friction points this paper is trying to address.

Challenges the claim that oversmoothing is the main obstacle to deep GNNs.
Separates oversmoothing from the vanishing-gradient problem across GNN operations.
Shows that deep GNNs train successfully with classical solutions.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distinguishes oversmoothing from the vanishing-gradient problem.
Uses skip connections to stack deep GNN layers.
Applies normalization to prevent performance degradation.
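The classical solutions named above (skip connections and normalization) can be sketched as a pre-norm residual GNN stack. This is a minimal illustrative NumPy implementation under assumed design choices (per-node layer normalization, ReLU, small random weights), not the authors' architecture.

```python
import numpy as np

def layer_norm(H, eps=1e-5):
    """Per-node normalization over the feature dimension."""
    mu = H.mean(axis=1, keepdims=True)
    sigma = H.std(axis=1, keepdims=True)
    return (H - mu) / (sigma + eps)

def deep_gnn_forward(A_hat, H, weights):
    """Pre-norm residual stack: H <- H + ReLU(A_hat @ LN(H) @ W)."""
    for W in weights:
        Z = layer_norm(H)
        Z = np.maximum(A_hat @ Z @ W, 0.0)  # aggregate, transform, activate
        H = H + Z  # skip connection keeps a direct gradient path
    return H

# Toy 3-node path graph with self-loops, symmetrically normalized.
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

rng = np.random.default_rng(0)
H0 = rng.normal(size=(3, 8))
weights = [0.1 * rng.normal(size=(8, 8)) for _ in range(64)]

H64 = deep_gnn_forward(A_hat, H0, weights)
print(H64.shape)  # (3, 8)
# Node features stay distinct after 64 layers: the residual path preserves
# the input's node-wise differences instead of collapsing to a constant.
node_spread = np.std(H64, axis=0).mean()
```

Without the residual term (i.e., `H = Z` each layer), deep plain stacks degrade; the paper attributes this to vanishing gradients rather than to the aggregation step itself.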
MoonJeong Park
Graduate student at POSTECH
machine learning, graph neural networks, dynamical systems

Sunghyun Choi
Graduate School of Artificial Intelligence, POSTECH, South Korea

Jaeseung Heo
Graduate School of Artificial Intelligence, POSTECH, South Korea

Eunhyeok Park
POSTECH
neural network optimization, energy-efficient hardware design

Dongwoo Kim
Graduate School of Artificial Intelligence, Department of Computer Science & Engineering, POSTECH, South Korea