🤖 AI Summary
Existing LLM-GNN fusion methods for node classification on text-attributed graphs apply a single global, uniform strategy that overlooks node heterogeneity; the result is marginal performance gains and no fine-grained rationale for when to invoke the LLM. This work proposes a node-aware GNN-LLM fusion framework that introduces, for the first time, a lightweight, local-structure-aware node-level routing mechanism. It uses advantage-based reinforcement learning to make the non-differentiable decision of whether to invoke the LLM, dynamically activating it only at nodes where the GNN's predictions are weak. Evaluated on multiple benchmark datasets, our method achieves substantial improvements in classification accuracy on heterophilous nodes (up to +13%) while preserving top overall accuracy and low computational overhead. The framework establishes a novel, interpretable, and adaptive paradigm for large-model collaboration in graph learning.
📝 Abstract
Learning on text-attributed graphs has motivated the use of Large Language Models (LLMs) for graph learning. However, most fusion strategies are applied uniformly across all nodes and attain only small overall performance gains. We argue this stems from aggregate metrics that obscure when LLMs provide benefit, withholding actionable signals for designing new strategies. In this work, we reframe LLM-GNN fusion around the nodes where GNNs typically falter. We first show that GNN and LLM performance can differ significantly, with each excelling on distinct structural patterns, such as local homophily. To leverage this finding, we propose GLANCE (GNN with LLM Assistance for Neighbor- and Context-aware Embeddings), a framework that invokes an LLM to refine a GNN's prediction. GLANCE employs a lightweight router that, given inexpensive per-node signals, decides whether to query the LLM. Since the LLM calls are non-differentiable, the router is trained with an advantage-based objective that compares the utility of querying the LLM against relying solely on the GNN. Across multiple benchmarks, GLANCE attains the best performance balance across node subgroups, with significant gains on heterophilous nodes (up to $+13\%$) alongside top overall performance. Our findings highlight the value of adaptive, node-aware GNN-LLM architectures, where selectively invoking the LLM enables scalable deployment on large graphs without incurring high computational costs.
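The advantage-based objective described above can be sketched as a standard score-function (REINFORCE-style) surrogate loss. This is a minimal illustration, not the paper's implementation: the function name, signal encoding, and utility definitions are assumptions, and in practice the utilities would come from comparing the LLM-refined and GNN-only predictions against labels.

```python
import numpy as np

def advantage_router_loss(p_query, query, utility_llm, utility_gnn):
    """Sketch of a REINFORCE-style surrogate loss for the
    non-differentiable query-the-LLM decision (names are assumptions).

    p_query     : router's probability of querying the LLM, per node
    query       : sampled binary action per node (1 = LLM was queried)
    utility_llm : utility of the LLM-refined prediction, per node
    utility_gnn : utility of the GNN-only prediction, per node
    """
    # Advantage of invoking the LLM over relying solely on the GNN.
    advantage = utility_llm - utility_gnn
    # Log-probability of the action the router actually sampled.
    log_prob = np.where(query == 1,
                        np.log(p_query + 1e-8),
                        np.log(1.0 - p_query + 1e-8))
    # Minimizing this raises p_query where the advantage is positive
    # and lowers it where querying the LLM did not help.
    return -np.mean(advantage * log_prob)
```

Intuitively, gradient descent on this loss teaches the router to spend LLM calls only on nodes (e.g., heterophilous ones) where the refined prediction is expected to beat the GNN alone.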