🤖 AI Summary
This paper addresses unsupervised node clustering on large-scale graphs with massive missing node attributes. To tackle attribute incompleteness, we propose CMV-ND—a novel framework centered on a multi-view neighborhood differentiation strategy: it recursively explores neighborhoods across hop distances to achieve complete and non-redundant coverage, then constructs K+1 complementary structural views via neighborhood differencing, thereby mitigating representation degradation induced by missing attributes. Furthermore, we design a multi-view clustering fusion mechanism that jointly optimizes with deep graph clustering objectives. Extensive experiments on six benchmark datasets demonstrate that CMV-ND consistently improves clustering performance over diverse baselines—achieving an average 5.2% gain in accuracy (ACC)—while exhibiting strong scalability and robustness to varying missingness patterns. CMV-ND establishes a new paradigm for large-scale graph analysis under severe attribute incompleteness.
📝 Abstract
Deep graph clustering (DGC), which aims to unsupervisedly separate the nodes in an attribute graph into different clusters, has seen substantial potential in various industrial scenarios like community detection and recommendation. However, the real-world attribute graphs, e.g., social networks interactions, are usually large-scale and attribute-missing. To solve these two problems, we propose a novel DGC method termed underline{ extbf{C}}omplementary underline{ extbf{M}}ulti-underline{ extbf{V}}iew underline{ extbf{N}}eighborhood underline{ extbf{D}}ifferentiation ( extit{CMV-ND}), which preprocesses graph structural information into multiple views in a complete but non-redundant manner. First, to ensure completeness of the structural information, we propose a recursive neighborhood search that recursively explores the local structure of the graph by completely expanding node neighborhoods across different hop distances. Second, to eliminate the redundancy between neighborhoods at different hops, we introduce a neighborhood differential strategy that ensures no overlapping nodes between the differential hop representations. Then, we construct $K+1$ complementary views from the $K$ differential hop representations and the features of the target node. Last, we apply existing multi-view clustering or DGC methods to the views. Experimental results on six widely used graph datasets demonstrate that CMV-ND significantly improves the performance of various methods.