🤖 AI Summary
This work addresses the pervasive over-smoothing problem in diffusion-based graph neural networks (GNNs)—where deep propagation causes node representations to converge and lose discriminability. Grounded in theoretical analysis, we introduce operator semigroup theory to formally characterize over-smoothing as an ergodicity property of the diffusion operator, and derive an explicit analytical form for smoothed features. Building on this insight, we propose a general and relaxed “ergodicity-breaking” condition that unifies the underlying mechanisms of diverse existing mitigation strategies. Furthermore, integrating invariant measure modeling with Dirichlet energy quantification, our framework effectively suppresses feature smoothing—reducing average Dirichlet energy by 23.6%—and improves node classification accuracy by 1.2–4.7 percentage points across multiple benchmark datasets.
📝 Abstract
This paper presents an analytical study of the oversmoothing issue in diffusion-based Graph Neural Networks (GNNs). Generalizing beyond extant approaches grounded in random walk analysis or particle systems, we approach this problem through operator semigroup theory. This theoretical framework allows us to rigorously prove that oversmoothing is intrinsically linked to the ergodicity of the diffusion operator. Relying on semigroup method, we can quantitatively analyze the dynamic of graph diffusion and give a specific mathematical form of the smoothing feature by ergodicity and invariant measure of operator, which improves previous works only show existence of oversmoothing. This finding further poses a general and mild ergodicity-breaking condition, encompassing the various specific solutions previously offered, thereby presenting a more universal and theoretically grounded approach to relieve oversmoothing in diffusion-based GNNs. Additionally, we offer a probabilistic interpretation of our theory, forging a link with prior works and broadening the theoretical horizon. Our experimental results reveal that this ergodicity-breaking term effectively mitigates oversmoothing measured by Dirichlet energy, and simultaneously enhances performance in node classification tasks.