🤖 AI Summary
Existing zero-shot generalization methods for graph tasks suffer from a disconnect between graph-structure modeling and semantic understanding, and rely heavily on task-specific supervision. Method: We propose UniGTE, a unified graph–text joint encoding framework. It employs an instruction-tuned encoder–decoder architecture featuring a structure-aware graph–text cross-attention mechanism and learnable alignment tokens, enabling end-to-end alignment between graph topology and natural-language instructions. The decoder is a frozen pretrained autoregressive large language model, while the encoder augments a pretrained LLM with structural awareness; a reconstruction-based regularization objective replaces downstream fine-tuning entirely. Contribution/Results: UniGTE achieves state-of-the-art zero-shot performance, without any label supervision, across diverse tasks including node classification, link prediction, graph classification, and graph regression, and substantially improves cross-task and cross-domain generalization.
📝 Abstract
Generalizing to unseen graph tasks without task-specific supervision is challenging: conventional graph neural networks are typically tied to a fixed label space, while large language models (LLMs) struggle to capture graph structure. We introduce UniGTE, an instruction-tuned encoder-decoder framework that unifies structural and semantic reasoning. The encoder augments a pretrained autoregressive LLM with learnable alignment tokens and a structure-aware graph-text attention mechanism, enabling it to attend jointly to a tokenized graph and a natural-language task prompt while remaining permutation-invariant to node order. This yields compact, task-aware graph representations. Conditioned solely on these representations, a frozen LLM decoder predicts and reconstructs: it outputs the task answer and simultaneously paraphrases the input graph in natural language. The reconstruction objective regularizes the encoder to preserve structural cues. UniGTE is instruction-tuned on five datasets spanning node-level, edge-level, and graph-level tasks across diverse domains, yet requires no fine-tuning at inference. It achieves new state-of-the-art zero-shot results on node classification, link prediction, graph classification, and graph regression under cross-task and cross-domain settings, demonstrating that tight integration of graph structure with LLM semantics enables robust, transferable graph reasoning.
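The structure-aware graph–text attention and alignment-token pooling described in the abstract can be sketched roughly as follows. This is a minimal single-head NumPy illustration of the idea, not the paper's implementation; all function and variable names (`encode_graph_with_text`, the path-graph example, the dimensions) are hypothetical, and a real system would use a pretrained LLM's learned projections rather than raw dot products:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def encode_graph_with_text(node_feats, adj, text_feats, align_tokens):
    """Hypothetical sketch: structure-aware attention over nodes, then
    alignment tokens cross-attend jointly to graph and text."""
    d = node_feats.shape[1]
    # Structural bias: a node may only attend to its neighbors (and itself),
    # injecting graph topology into the attention pattern.
    mask = adj + np.eye(adj.shape[0])
    bias = np.where(mask > 0, 0.0, -1e9)
    scores = node_feats @ node_feats.T / np.sqrt(d) + bias
    nodes = softmax(scores) @ node_feats
    # Alignment tokens attend over the union of node states and text tokens,
    # pooling both into a compact, task-aware graph representation.
    memory = np.concatenate([nodes, text_feats], axis=0)
    attn = softmax(align_tokens @ memory.T / np.sqrt(d))
    return attn @ memory

rng = np.random.default_rng(0)
n_nodes, n_text, n_align, dim = 5, 4, 2, 8
X = rng.normal(size=(n_nodes, dim))            # node embeddings
A = np.zeros((n_nodes, n_nodes))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:  # a small path graph
    A[i, j] = A[j, i] = 1.0
T = rng.normal(size=(n_text, dim))             # tokenized task prompt
Q = rng.normal(size=(n_align, dim))            # learnable alignment tokens
graph_repr = encode_graph_with_text(X, A, T, Q)

# Relabeling the nodes (and permuting the adjacency consistently) leaves
# the pooled representation unchanged, illustrating permutation invariance.
perm = rng.permutation(n_nodes)
graph_repr_perm = encode_graph_with_text(X[perm], A[perm][:, perm], T, Q)
```

Because the alignment tokens pool over an unordered set of node states, the output is exactly invariant to node order (up to floating-point error), which is the property the abstract highlights.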