🤖 AI Summary
Heterogeneous graph semantic modeling faces three key challenges: strong reliance on predefined relations, scarcity of supervision signals, and a semantic gap between pretraining and fine-tuning. While large language models (LLMs) offer superior semantic understanding, their high computational cost and inherent incompatibility with graph-structured data hinder direct application. To address these issues, we propose ELLA, a novel framework that synergistically integrates LLMs with heterogeneous graph learning. ELLA introduces three core innovations: (1) an LLM-aware relation tokenizer, (2) a hop-level relational graph transformer, and (3) fine-grained, task-aware chain-of-thought prompting via textual instructions. The framework achieves linear inference complexity, supports 13B-parameter LLMs, and outperforms state-of-the-art methods on four benchmark datasets. Moreover, it accelerates inference by up to 4× compared to existing LLM-based approaches.
📝 Abstract
Heterogeneous graphs are widely present in real-world complex networks, where the diversity of node and relation types leads to complex and rich semantics. Efforts to model complex relation semantics in heterogeneous graphs are restricted by the limitations of predefined semantic dependencies and the scarcity of supervised signals. The advanced pre-training and fine-tuning paradigm leverages graph structure to provide rich self-supervised signals, but introduces semantic gaps between tasks. Large Language Models (LLMs) offer significant potential to address the semantic issues of relations and tasks in heterogeneous graphs through their strong reasoning capabilities in the textual modality, but their incorporation into heterogeneous graphs is largely limited by computational complexity. Therefore, in this paper, we propose an Efficient LLM-Aware (ELLA) framework for heterogeneous graphs, addressing the above issues. To capture complex relation semantics, we propose an LLM-aware Relation Tokenizer that leverages an LLM to encode multi-hop, multi-type relations. To reduce computational complexity, we further employ a Hop-level Relation Graph Transformer, which reduces the complexity of LLM-aware relation reasoning from exponential to linear. To bridge semantic gaps between pre-training and fine-tuning tasks, we introduce fine-grained, task-aware textual Chain-of-Thought (CoT) prompts. Extensive experiments on four heterogeneous graphs show that our proposed ELLA outperforms state-of-the-art methods in both performance and efficiency. In particular, ELLA scales up to 13B-parameter LLMs and achieves up to a 4× speedup compared with existing LLM-based methods. Our code is publicly available at https://github.com/l-wd/ELLA.
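To give intuition for the exponential-to-linear reduction, the sketch below contrasts enumerating multi-hop relation paths (whose count grows exponentially in hop depth) with hop-level aggregation, which produces one pooled representation per hop. This is a minimal toy illustration, not ELLA's implementation: all function names, shapes, and the mean-pooling choice are assumptions for the example.

```python
import numpy as np

def hop_level_aggregate(features, neighbors, num_hops):
    """Pool neighbor features hop by hop; cost grows linearly in num_hops.
    Contrast: enumerating all typed relation paths of length K costs O(T^K)
    for T relation types, i.e. exponential in the hop depth.
    (Toy sketch; not the paper's actual architecture.)"""
    reps = [features]  # hop-0 representation: the node itself
    current = features
    for _ in range(num_hops):
        # mean-pool each node's 1-hop neighborhood of the previous hop's reps
        current = np.stack([
            current[nbrs].mean(axis=0) if nbrs else np.zeros(features.shape[1])
            for nbrs in neighbors
        ])
        reps.append(current)
    # one token per hop, e.g. to feed a hop-level transformer
    return np.stack(reps, axis=1)  # shape: (num_nodes, num_hops + 1, dim)

# toy graph: 3 nodes on a path, one-hot features
feats = np.eye(3)
nbrs = [[1], [0, 2], [1]]
tokens = hop_level_aggregate(feats, nbrs, num_hops=2)
print(tokens.shape)  # (3, 3, 3): 3 nodes, 3 hop tokens, dim 3
```

The key point is that the number of hop tokens per node equals `num_hops + 1`, independent of how many distinct relation paths exist, which is what makes downstream reasoning over these tokens linear in hop depth.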