🤖 AI Summary
Heterogeneous graph semantic modeling faces three key challenges: strong reliance on predefined relations, scarcity of supervision signals, and a semantic gap between pretraining and fine-tuning. While large language models (LLMs) offer superior semantic understanding, their high computational cost and inherent incompatibility with graph-structured data hinder direct application. To address these issues, we propose ELLA, a novel framework that synergistically integrates LLMs with heterogeneous graph learning. ELLA introduces three core innovations: (1) an LLM-aware relation tokenizer, (2) a hop-level relational graph transformer, and (3) fine-grained, task-aware chain-of-thought prompting via textual instructions. The framework achieves linear inference complexity, supports 13B-parameter LLMs, and outperforms state-of-the-art methods on four benchmark datasets. Moreover, it accelerates inference by up to 4× compared to existing LLM-based approaches.
📝 Abstract
Heterogeneous graphs are widely present in real-world complex networks, where the diversity of node and relation types leads to complex and rich semantics. Efforts to model complex relation semantics in heterogeneous graphs are restricted by the limitations of predefined semantic dependencies and the scarcity of supervised signals. The advanced pre-training and fine-tuning paradigm leverages graph structure to provide rich self-supervised signals, but introduces semantic gaps between tasks. Large Language Models (LLMs) offer significant potential to address the semantic issues of relations and tasks in heterogeneous graphs through their strong reasoning capabilities in the textual modality, but their incorporation into heterogeneous graphs is largely limited by computational complexity. Therefore, in this paper, we propose an Efficient LLM-Aware (ELLA) framework for heterogeneous graphs, addressing the above issues. To capture complex relation semantics, we propose an LLM-aware Relation Tokenizer that leverages an LLM to encode multi-hop, multi-type relations. To reduce computational complexity, we further employ a Hop-level Relation Graph Transformer, which reduces the complexity of LLM-aware relation reasoning from exponential to linear. To bridge semantic gaps between pre-training and fine-tuning tasks, we introduce fine-grained, task-aware textual Chain-of-Thought (CoT) prompts. Extensive experiments on four heterogeneous graphs show that our proposed ELLA outperforms state-of-the-art methods in both performance and efficiency. In particular, ELLA scales up to 13B-parameter LLMs and achieves up to a 4× speedup compared with existing LLM-based methods. Our code is publicly available at https://github.com/l-wd/ELLA.
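To give intuition for the exponential-to-linear reduction, the sketch below contrasts enumerating multi-hop relation paths (whose count grows exponentially in hop depth) with hop-level aggregation, which produces one pooled representation per hop. This is a minimal toy illustration, not ELLA's implementation: all function names, shapes, and the mean-pooling choice are assumptions for the example.

```python
import numpy as np

def hop_level_aggregate(features, neighbors, num_hops):
    """Pool neighbor features hop by hop; cost grows linearly in num_hops.
    Contrast: enumerating all typed relation paths of length K costs O(T^K)
    for T relation types, i.e. exponential in the hop depth.
    (Toy sketch; not the paper's actual architecture.)"""
    reps = [features]  # hop-0 representation: the node itself
    current = features
    for _ in range(num_hops):
        # mean-pool each node's 1-hop neighborhood of the previous hop's reps
        current = np.stack([
            current[nbrs].mean(axis=0) if nbrs else np.zeros(features.shape[1])
            for nbrs in neighbors
        ])
        reps.append(current)
    # one token per hop, e.g. to feed a hop-level transformer
    return np.stack(reps, axis=1)  # shape: (num_nodes, num_hops + 1, dim)

# toy graph: 3 nodes on a path, one-hot features
feats = np.eye(3)
nbrs = [[1], [0, 2], [1]]
tokens = hop_level_aggregate(feats, nbrs, num_hops=2)
print(tokens.shape)  # (3, 3, 3): 3 nodes, 3 hop tokens, dim 3
```

The key point is that the number of hop tokens per node equals `num_hops + 1`, independent of how many distinct relation paths exist, which is what makes downstream reasoning over these tokens linear in hop depth.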