Trustworthy Neighborhoods Mining: Homophily-Aware Neutral Contrastive Learning for Graph Clustering

📅 2025-12-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing neighbor-based graph contrastive learning methods rely on the homophily assumption, so on low-homophily graphs, where neighborhood information is unreliable, they tend to produce node representations that are hard to tell apart. To address this, the paper proposes NeuCGC, a neutral contrastive graph clustering framework that replaces the conventional binary positive/negative partition of node pairs with weighted "neutral" pairs whose weights adapt to the graph's homophily level. Its two key components are: (1) a homophily-adaptive neighborhood distribution alignment module, and (2) a neighborhood feature consistency learning module that draws on high-confidence graphs. Experiments show that NeuCGC outperforms state-of-the-art graph clustering methods and is robust. The source code is publicly available.

📝 Abstract
Recently, neighbor-based contrastive learning has been introduced to effectively exploit neighborhood information for clustering. However, these methods rely on the homophily assumption (that connected nodes share similar class labels and should therefore be close in feature space), which fails to account for the varying homophily levels in real-world graphs. As a result, applying contrastive learning to low-homophily graphs may lead to indistinguishable node representations due to unreliable neighborhood information, making it challenging to identify trustworthy neighborhoods with varying homophily levels in graph clustering. To tackle this, we introduce a novel neighborhood Neutral Contrastive Graph Clustering method, NeuCGC, that extends traditional contrastive learning by incorporating neutral pairs: node pairs treated as weighted positive pairs, rather than strictly positive or negative. These neutral pairs are dynamically adjusted based on the graph's homophily level, enabling a more flexible and robust learning process. Leveraging neutral pairs in contrastive learning, our method incorporates two key components: (1) an adaptive contrastive neighborhood distribution alignment that adjusts based on the homophily level of the given attribute graph, ensuring effective alignment of neighborhood distributions, and (2) a contrastive neighborhood node feature consistency learning mechanism that leverages reliable neighborhood information from high-confidence graphs to learn robust node representations, mitigating the adverse effects of varying homophily levels and effectively exploiting highly trustworthy neighborhood information. Experimental results demonstrate the effectiveness and robustness of our approach, outperforming other state-of-the-art graph clustering methods. Our code is available at https://github.com/THPengL/NeuCGC.
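To make the notion of a "homophily level" concrete, here is a minimal sketch of the commonly used edge homophily ratio: the fraction of edges whose endpoints share a class label. It is a diagnostic that needs ground-truth labels, so it is not something a clustering method can compute directly, and it is not taken from the NeuCGC code. The function name `edge_homophily` and the toy graph are assumptions made for illustration.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label.

    edges  : list of (u, v) node-index pairs (hypothetical toy input)
    labels : list where labels[i] is the class of node i
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (illustrative only): a 4-cycle with two classes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
labels = [0, 0, 1, 1]

# Edges (0, 1) and (2, 3) join same-class nodes; the other two do not.
ratio = edge_homophily(edges, labels)
print(ratio)
```

A ratio near 1 means neighbors mostly agree on class, which is the regime the homophily assumption expects; a ratio near 0 means an edge is weak evidence that two nodes belong together.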
Problem

Research questions and friction points this paper is trying to address.

Addresses unreliable neighborhoods in low-homophily graphs for clustering
Introduces neutral pairs to flexibly handle varying homophily levels
Ensures robust node representation via adaptive contrastive learning mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neutral pairs dynamically adjust based on homophily levels
Adaptive contrastive neighborhood distribution alignment mechanism
Contrastive neighborhood node feature consistency learning
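
The idea of treating node pairs as weighted positives, rather than strictly positive or negative, can be sketched with a generic soft-weighted InfoNCE loss. This is not the loss used in NeuCGC: the function `weighted_contrastive_loss`, the weight matrices, the temperature `tau`, and the toy embeddings are all assumptions for illustration, and how the paper derives its pair weights from the homophily level is not shown here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def weighted_contrastive_loss(z, weights, tau=0.5):
    """Soft-weighted InfoNCE (illustrative, not the paper's formula).

    weights[i][j] in [0, 1] says how strongly node j is pulled toward
    anchor i; 0 means j only acts as a negative in the denominator.
    """
    n = len(z)
    if n < 2:
        raise ValueError("need at least two nodes")
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        logits = [cosine(z[i], z[j]) / tau for j in others]
        # Numerically stable log-sum-exp over all other nodes.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Each pair contributes its log-probability scaled by its weight.
        total -= sum(weights[i][j] * (l - log_denom)
                     for j, l in zip(others, logits))
    return total / n


# Toy embeddings (hypothetical): nodes 0,1 point one way, nodes 2,3 another.
z = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

# Weights that agree with the embedding geometry...
w_aligned = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
# ...and weights that pull dissimilar nodes together instead.
w_misaligned = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
# Same pairs as w_aligned, but treated as half-strength positives.
w_half = [[0.5 * w for w in row] for row in w_aligned]

loss_good = weighted_contrastive_loss(z, w_aligned)
loss_bad = weighted_contrastive_loss(z, w_misaligned)
loss_half = weighted_contrastive_loss(z, w_half)
print(loss_good < loss_bad)
```

In this sketch the loss is linear in the weights, so a pair with weight 0.5 exerts half the pull of a full positive. Lowering the weight of an untrustworthy neighbor therefore softens its influence instead of forcing a hard positive-or-negative decision.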
Authors
Liang Peng
Department of Computer Science, Shantou University
Yixuan Ye
Data Scientist - Research, Google LLC
Statistical Modeling, Genetic Prediction
Cheng Liu
College of Computer Science and Technology, Huaqiao University and the Department of Computer Science, Shantou University
Hangjun Che
Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China
Man-Fai Leung
School of Computing and Information Science, Faculty of Science and Engineering, Anglia Ruskin University, Cambridge, U.K.
Si Wu
School of Computer Science and Engineering, South China University of Technology
Hau-San Wong
Professor, Department of Computer Science, City University of Hong Kong
Artificial Intelligence, Machine Learning, Data Mining, Bioinformatics