CrossHGL: A Text-Free Foundation Model for Cross-Domain Heterogeneous Graph Learning

📅 2026-03-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing heterogeneous graph representation learning methods assume a closed world and a shared feature space, which hinders cross-domain generalization under text-free and few-shot conditions. To address this, the paper proposes CrossHGL, a foundation model for cross-domain heterogeneous graphs in such settings. CrossHGL combines semantics-preserving graph homogenization with a Tri-Prompt mechanism that jointly models transferable knowledge at the feature, edge, and structure levels. It is trained with prompt-aware multi-domain self-supervised pre-training and adapted to target domains with a parameter-efficient fine-tuning strategy based on prompt composition. In experiments, CrossHGL achieves average relative Micro-F1 improvements over state-of-the-art baselines of 25.1% on node classification and 7.6% on graph classification, and it remains competitive in challenging feature-degenerated settings.
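The homogenization idea can be illustrated with a minimal sketch. It assumes that "encoding interaction semantics into edge features" can be approximated by a one-hot vector over (source type, relation, destination type) triples; the source does not specify the actual encoding CrossHGL uses, and `TypedEdge` and `homogenize` are hypothetical names introduced only for this example.

```python
from collections import namedtuple

# Hypothetical typed edge: (source node id, relation name, destination node id).
TypedEdge = namedtuple("TypedEdge", ["src", "relation", "dst"])


def homogenize(node_types, typed_edges):
    """Collapse a heterogeneous graph into one node set and one edge list.

    node_types: dict mapping node id -> node type name.
    typed_edges: iterable of TypedEdge.
    Returns (node_index, edges, edge_features, signatures).
    """
    typed_edges = list(typed_edges)

    # Re-index all nodes into a single shared id space, ignoring their types.
    node_index = {node: i for i, node in enumerate(sorted(node_types))}

    # Each distinct (source type, relation, destination type) triple is one
    # kind of interaction; sorting gives a deterministic feature layout.
    signatures = sorted(
        {(node_types[e.src], e.relation, node_types[e.dst]) for e in typed_edges}
    )
    sig_index = {sig: i for i, sig in enumerate(signatures)}

    edges, edge_features = [], []
    for e in typed_edges:
        edges.append((node_index[e.src], node_index[e.dst]))
        # One-hot vector marking which interaction this edge represents
        # (an assumption, not the paper's documented encoding).
        one_hot = [0.0] * len(signatures)
        sig = (node_types[e.src], e.relation, node_types[e.dst])
        one_hot[sig_index[sig]] = 1.0
        edge_features.append(one_hot)
    return node_index, edges, edge_features, signatures
```

After this step the graph has a single node type and a single edge type, while the relation information survives in the per-edge feature vectors instead of in the schema.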

📝 Abstract

Heterogeneous graph representation learning (HGRL) is essential for modeling complex systems with diverse node and edge types. However, most existing methods are limited to closed-world settings with shared schemas and feature spaces, hindering cross-domain generalization. While recent graph foundation models improve transferability, they often target homogeneous graphs, rely on domain-specific schemas, or require rich textual attributes. Consequently, text-free and few-shot cross-domain HGRL remains underexplored. To address this, we propose CrossHGL, a foundation framework that preserves and transfers multi-relational structural semantics without external textual supervision. Specifically, a semantic-preserving transformation strategy homogenizes heterogeneous graphs while encoding interaction semantics into edge features. Based on this, a prompt-aware multi-domain pre-training framework with a Tri-Prompt mechanism captures transferable knowledge across feature, edge, and structure perspectives via self-supervised contrastive learning. For target-domain adaptation, we develop a parameter-efficient fine-tuning strategy that freezes the pre-trained backbone and performs few-shot classification via prompt composition and prototypical learning. Experiments on node-level and graph-level tasks show that CrossHGL consistently outperforms state-of-the-art baselines, yielding average relative improvements of 25.1% and 7.6% in Micro-F1 for node and graph classification, respectively, while remaining competitive in challenging feature-degenerated settings.
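The few-shot classification step mentioned in the abstract relies on prototypical learning, which can be sketched generically as follows. This is a minimal illustration of nearest-prototype classification, not the authors' implementation: it assumes embeddings have already been produced by a frozen backbone, it does not model prompt composition, and the choice of Euclidean distance and the function names `class_prototypes` and `classify` are assumptions made for this example.

```python
import math


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    sums, counts = {}, {}
    for vec, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(query_embedding, prototypes):
    """Assign the query to the class whose prototype is nearest (Euclidean)."""
    return min(
        prototypes,
        key=lambda label: math.dist(query_embedding, prototypes[label]),
    )
```

With only a handful of labeled examples per class, the prototypes are the only class-specific quantities that need to be computed, which is why this scheme pairs naturally with a frozen backbone.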

Problem

Research questions and friction points this paper is trying to address.

heterogeneous graph

cross-domain

text-free

few-shot

graph representation learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

heterogeneous graph learning

foundation model

cross-domain transfer