🤖 AI Summary
Unsupervised graph clustering suffers from performance limitations due to the absence of reliable supervisory signals; existing pseudo-labeling methods rely solely on feature centroids to construct a single target distribution, resulting in insufficient clustering guidance and vulnerability to noise. To address this, we propose a neighborhood-distribution-driven dual-centroid optimization framework. First, we explicitly model the neighborhood distribution of each node to identify hard negative samples and generate robust supervisory signals. Second, we jointly optimize both feature centroids and neighborhood centroids to establish a dual-target distribution, thereby enabling more reliable representation learning and cluster assignment. Our method integrates graph neural networks, contrastive learning, pseudo-label correction, and neighborhood distribution modeling. Extensive experiments on multiple benchmark datasets demonstrate significant improvements over state-of-the-art methods, validating both effectiveness and robustness.
📝 Abstract
Graph clustering is crucial for unraveling intricate data structures, yet it presents significant challenges due to its unsupervised nature. Recently, goal-directed clustering techniques have yielded impressive results, with contrastive learning methods that leverage pseudo-labels garnering considerable attention. Nonetheless, pseudo-labels are an unreliable supervision signal, and existing goal-directed approaches use only features to construct a single-target distribution for single-center optimization, which leads to incomplete and less dependable guidance. In our work, we propose a novel Dual-Center Graph Clustering (DCGC) approach based on neighbor distribution properties, which comprises representation learning with neighbor distribution and dual-center optimization. Specifically, we use the neighbor distribution as a supervision signal to mine hard negative samples in contrastive learning; this signal is reliable and enhances the effectiveness of representation learning. Furthermore, a neighbor distribution center is introduced alongside the feature center to jointly construct a dual-target distribution for dual-center optimization. Extensive experiments and analysis demonstrate the superior performance and effectiveness of our proposed method.
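
The abstract does not give formulas, so the following is only a minimal sketch of how a neighbor distribution and a dual-target distribution might be computed. It rests on several assumptions that are not taken from the paper: the neighbor distribution is modeled as the fraction of a node's neighbors assigned to each cluster, soft assignments use a DEC-style Student-t kernel, and targets are obtained by the usual squaring-and-renormalizing step. All function names, the toy graph, and the toy embeddings are hypothetical.

```python
import numpy as np

def neighbor_distribution(adj, labels, k):
    """Row i = fraction of node i's neighbors in each of k clusters (assumed definition)."""
    one_hot = np.eye(k)[labels]                # (n, k) cluster indicators
    counts = adj @ one_hot                     # neighbor counts per cluster
    deg = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(deg, 1)         # isolated nodes keep an all-zero row

def cluster_means(x, labels, k):
    """Mean of the rows of x per cluster; assumes every cluster is non-empty."""
    return np.stack([x[labels == c].mean(axis=0) for c in range(k)])

def soft_assign(x, centers):
    """Student-t soft assignment of each row of x to each center (DEC-style, assumed)."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target: square q, divide by soft cluster size, renormalize rows."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

labels = np.array([0, 0, 0, 1, 1, 1])          # hypothetical pseudo-labels
z = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]])  # hypothetical embeddings

nd = neighbor_distribution(adj, labels, k=2)

# One target per view: feature centers and neighbor distribution centers.
p_feat = target_distribution(soft_assign(z, cluster_means(z, labels, 2)))
p_nbr = target_distribution(soft_assign(nd, cluster_means(nd, labels, 2)))
```

In this sketch the two targets `p_feat` and `p_nbr` would each supervise the soft assignments of their own view, for example through a KL-divergence term per view; how DCGC actually combines them is not stated in the abstract.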