Abstract
This paper introduces a refined graph encoder embedding method that enhances the original graph encoder embedding through linear transformation, self-training, and hidden-community recovery within the observed communities. We provide the theoretical rationale for the refinement procedure, demonstrating how and why the proposed method can effectively identify useful hidden communities under stochastic block models, and how the refinement leads to improved vertex embeddings and better decision boundaries for subsequent vertex classification. The efficacy of the approach is validated through numerical experiments, which show clear advantages in identifying meaningful latent communities and improved vertex classification across a collection of simulated and real-world graph data.
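To make the setting concrete, below is a minimal sketch of the base graph encoder embedding (each vertex's embedding is its adjacency row multiplied by a class-indicator matrix scaled by class sizes) together with an illustrative self-training loop. This is an assumed simplification for exposition, not the paper's exact refinement procedure: the function names are hypothetical, and the linear-transformation and hidden-community-recovery steps are not reproduced here.

```python
import numpy as np

def encoder_embedding(A, y, K):
    # One-hot graph encoder embedding: Z = A @ W, where column k of W
    # is the indicator vector of class k scaled by 1 / (size of class k).
    n = A.shape[0]
    counts = np.bincount(y, minlength=K)
    W = np.zeros((n, K))
    W[np.arange(n), y] = 1.0 / counts[y]  # assumes every class is non-empty
    return A @ W

def self_training_refinement(A, y, K, n_iter=5):
    # Illustrative self-training loop (not the paper's exact procedure):
    # re-embed, re-assign each vertex to the class of its largest
    # embedding coordinate, and stop once the labels stabilize.
    y = y.copy()
    for _ in range(n_iter):
        Z = encoder_embedding(A, y, K)
        y_new = Z.argmax(axis=1)
        if np.array_equal(y_new, y):
            break
        y = y_new
    return Z, y
```

Under a stochastic block model, each coordinate of Z estimates a vertex's average connectivity to one block, which is why vertices from distinct (possibly hidden) communities separate in this embedding space.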