🤖 AI Summary
Problem: Node representation quality in heterogeneous graphs degrades under severe label scarcity. Method: We propose a three-view self-supervised contrastive learning framework that jointly models node attributes, low-order structure (1-hop neighborhoods), and high-order structure (multi-hop semantic paths). Contribution/Results: This is the first work to achieve joint multi-scale structural and attribute modeling in heterogeneous graph contrastive learning. We design an attribute-enhanced positive sample selection strategy to mitigate the bias of purely structure-driven sampling, introduce an attribute-structure joint similarity metric, and incorporate a high-order neighborhood aggregation mechanism. Extensive experiments on four real-world heterogeneous graph datasets demonstrate that our method significantly outperforms state-of-the-art unsupervised approaches, with several metrics even surpassing supervised baselines, validating its effectiveness and generalizability under extreme label scarcity.
📝 Abstract
Heterogeneous graphs (HGs) are composed of multiple types of nodes and edges, making them more effective at capturing the complex relational structures inherent in the real world. However, in real-world scenarios labeled data is often difficult to obtain, which limits the applicability of semi-supervised approaches. Self-supervised learning enables models to automatically learn useful features from the data itself, effectively addressing the challenge of limited labeled data. In this paper, we propose a novel contrastive learning framework for heterogeneous graphs (ASHGCL) that incorporates three distinct views, focusing on node attributes, high-order structural information, and low-order structural information, respectively, to effectively capture all three signals for node representation learning. Furthermore, we introduce an attribute-enhanced positive sample selection strategy that combines structural and attribute information, effectively addressing the issue of sampling bias. Extensive experiments on four real-world datasets show that ASHGCL outperforms state-of-the-art unsupervised baselines and even surpasses some supervised benchmarks.
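The abstract's attribute-enhanced positive sample selection can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the paper's actual implementation: it combines attribute cosine similarity with a normalized meta-path connection count via a weight `alpha` (all function names, the normalization, and the top-k selection rule are illustrative choices, not taken from the paper).

```python
import numpy as np

def joint_similarity(attrs, meta_path_counts, alpha=0.5):
    """Illustrative attribute-structure joint similarity.

    attrs: (n, d) node attribute matrix.
    meta_path_counts: (n, n) counts of meta-path connections between nodes
    (a stand-in for the paper's structural signal).
    alpha: weight on the attribute term (hypothetical hyperparameter).
    """
    # Cosine similarity between node attribute vectors.
    unit = attrs / np.linalg.norm(attrs, axis=1, keepdims=True)
    attr_sim = unit @ unit.T
    # Normalize structural counts to [0, 1] so the two terms are comparable.
    struct_sim = meta_path_counts / max(meta_path_counts.max(), 1.0)
    return alpha * attr_sim + (1.0 - alpha) * struct_sim

def select_positives(sim, k=2):
    """For each node, pick the top-k most similar other nodes as positives."""
    sim = sim.copy()
    np.fill_diagonal(sim, -np.inf)  # a node is never its own positive here
    return np.argsort(-sim, axis=1)[:, :k]
```

Because the score mixes both signals, a node strongly connected to a structurally close but attribute-dissimilar neighbor can still prefer an attribute-similar node as a positive, which is the kind of sampling bias the strategy is meant to mitigate.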