Rethinking Semi-Supervised Node Classification with Self-Supervised Graph Clustering

Date: 2025-11-25
Citations: 0
Influential: 0
AI Summary
To address label scarcity in semi-supervised node classification, this paper proposes NCGC, a framework that unifies self-supervised graph clustering with node classification and uses the graph's intrinsic community structure as an additional supervisory signal. The authors first show theoretically that the optimization objectives of GNNs and spectral graph clustering can be unified, and on that basis develop soft orthogonal GNNs (SOGNs), whose refined message passing produces node representations for both classification and clustering. A self-supervised clustering module with two clustering objectives and a Sinkhorn-Knopp normalization converts predicted cluster assignments into balanced soft pseudo-labels. Training uses a multi-task objective that combines a supervised classification loss on labeled nodes with a self-supervised clustering loss on unlabeled nodes, and the framework works with various classic GNN backbones. Experiments on seven real graphs report consistent and considerable improvements over popular GNN models and recent baselines.

Abstract
The emergence of graph neural networks (GNNs) has offered a powerful tool for semi-supervised node classification tasks. Subsequent studies have achieved further improvements through refining the message passing schemes in GNN models or exploiting various data augmentation techniques to mitigate limited supervision. In real graphs, nodes often tend to form tightly-knit communities/clusters, which embody abundant signals for compensating label scarcity in semi-supervised node classification but are not explored in prior methods. Inspired by this, this paper presents NCGC that integrates self-supervised graph clustering and semi-supervised classification into a unified framework. Firstly, we theoretically unify the optimization objectives of GNNs and spectral graph clustering, and based on that, develop soft orthogonal GNNs (SOGNs) that leverage a refined message passing paradigm to generate node representations for both classification and clustering. On top of that, NCGC includes a self-supervised graph clustering module that enables the training of SOGNs for learning representations of unlabeled nodes in a self-supervised manner. Particularly, this component comprises two non-trivial clustering objectives and a Sinkhorn-Knopp normalization that transforms predicted cluster assignments into balanced soft pseudo-labels. Through combining the foregoing clustering module with the classification model using a multi-task objective containing the supervised classification loss on labeled data and self-supervised clustering loss on unlabeled data, NCGC promotes synergy between them and achieves enhanced model capacity. Our extensive experiments showcase that the proposed NCGC framework consistently and considerably outperforms popular GNN models and recent baselines for semi-supervised node classification on seven real graphs, when working with various classic GNN backbones.
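
The Sinkhorn-Knopp step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sinkhorn_knopp`, the temperature `epsilon`, the iteration count, and the toy scores are all assumptions chosen for illustration. The code alternately rescales the columns and rows of a positive score matrix, so that each node keeps a probability distribution over clusters while each cluster receives roughly the same total mass.

```python
import math


def sinkhorn_knopp(logits, n_iters=50, epsilon=0.5):
    """Turn an n x k matrix of cluster scores into balanced soft assignments.

    After the loop, every row sums to 1 (one distribution per node) and
    every column sums to approximately n / k (equal mass per cluster).
    `epsilon` is a hypothetical temperature; smaller values give sharper labels.
    """
    n, k = len(logits), len(logits[0])
    # Subtract the global maximum before exponentiating for numerical stability.
    top = max(max(row) for row in logits)
    q = [[math.exp((v - top) / epsilon) for v in row] for row in logits]
    for _ in range(n_iters):
        # Column step: rescale each cluster's column to total mass n / k.
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] * (n / k) / col[j] for j in range(k)] for i in range(n)]
        # Row step: rescale each node's row so it sums to 1.
        q = [[v / sum(row) for v in row] for row in q]
    return q


# Toy cluster scores for 4 nodes and 2 clusters; three nodes prefer cluster 0.
scores = [[2.0, 0.1], [1.8, 0.3], [1.5, 0.2], [0.4, 1.0]]
pseudo_labels = sinkhorn_knopp(scores)
for row in pseudo_labels:
    print([round(v, 3) for v in row])
```

Without the column step, a plain softmax would assign most of the mass to cluster 0 for these scores. The balancing constraint is what prevents all nodes from collapsing into a single cluster when the labels are learned without supervision.
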
Problem

Research questions and friction points this paper is trying to address.

Labeled nodes are scarce in semi-supervised node classification, which limits what GNNs can learn from supervision alone
Community/cluster structure in real graphs carries signal that could compensate for limited labels but is not explored in prior methods
How to combine clustering and classification objectives in a single unified framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates self-supervised clustering with classification
Uses soft orthogonal GNNs for dual representation learning
Employs multi-task objective combining supervised and self-supervised losses
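
The multi-task objective in the last item can be sketched as a weighted sum of two losses. This is a simplified illustration under stated assumptions, not the paper's formulation: the abstract mentions two clustering objectives without giving their form, so a single cross-entropy against soft pseudo-labels stands in for them here, and the names `cross_entropy`, `multi_task_loss`, and the trade-off parameter `weight` are hypothetical.

```python
import math


def cross_entropy(probs, targets):
    """Mean cross-entropy between predicted rows and hard or soft target rows."""
    total = 0.0
    for p_row, t_row in zip(probs, targets):
        # Clamp probabilities away from zero so log() stays finite.
        total -= sum(t * math.log(max(p, 1e-12)) for p, t in zip(p_row, t_row))
    return total / len(probs)


def multi_task_loss(class_probs, one_hot_labels, cluster_probs, pseudo_labels,
                    weight=1.0):
    """Supervised loss on labeled nodes plus weighted loss on unlabeled nodes."""
    supervised = cross_entropy(class_probs, one_hot_labels)
    # Stand-in for the clustering objectives: match balanced soft pseudo-labels.
    self_supervised = cross_entropy(cluster_probs, pseudo_labels)
    return supervised + weight * self_supervised


# One labeled node predicted perfectly, one unlabeled node with uniform outputs.
loss = multi_task_loss(
    class_probs=[[1.0, 0.0]],
    one_hot_labels=[[1.0, 0.0]],
    cluster_probs=[[0.5, 0.5]],
    pseudo_labels=[[0.5, 0.5]],
    weight=2.0,
)
print(loss)
```

In this toy call the supervised term is zero, so the result is `weight` times the entropy of the uniform pseudo-label. In practice both terms would be computed from the outputs of a shared GNN backbone, so that gradients from unlabeled nodes also shape the representations used for classification.
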
Authors

Songbo Wang, The University of Hong Kong
Renchi Yang, Hong Kong Baptist University
Yurui Lai, Hong Kong Baptist University
Xiaoyang Lin, Hong Kong Baptist University
Tsz Nam Chan, Shenzhen University (Distinguished Professor)
Kernel Methods, Similarity Measures, Spatial and Temporal Data, GIS