🤖 AI Summary
To address few-shot node classification on Text-Attributed Graphs (TAGs), this paper proposes a preference-driven knowledge distillation framework that synergistically integrates Large Language Models (LLMs) and heterogeneous Graph Neural Networks (GNNs). Methodologically, it introduces dual preference mechanisms—GNN-level and node-level—to dynamically select the optimal teacher model per node; it then performs hierarchical knowledge distillation by jointly leveraging LLMs’ few-shot semantic reasoning and GNNs’ topological message passing. The key innovation lies in the first incorporation of node-level preferences into graph knowledge distillation, enabling fine-grained, topology-aware teacher selection and knowledge transfer. Extensive experiments on multiple real-world TAG benchmarks demonstrate that the framework significantly improves few-shot classification accuracy, achieving an average gain of 5.2% over state-of-the-art baselines.
📝 Abstract
Graph neural networks (GNNs) can efficiently process text-attributed graphs (TAGs) due to their message-passing mechanisms, but their training heavily relies on the human-annotated labels. Moreover, the complex and diverse local topologies of nodes of real-world TAGs make it challenging for a single mechanism to handle. Large language models (LLMs) perform well in zero-/few-shot learning on TAGs but suffer from a scalability challenge. Therefore, we propose a preference-driven knowledge distillation (PKD) framework to synergize the complementary strengths of LLMs and various GNNs for few-shot node classification. Specifically, we develop a GNN-preference-driven node selector that effectively promotes prediction distillation from LLMs to teacher GNNs. To further tackle nodes' intricate local topologies, we develop a node-preference-driven GNN selector that identifies the most suitable teacher GNN for each node, thereby facilitating tailored knowledge distillation from teacher GNNs to the student GNN. Extensive experiments validate the efficacy of our proposed framework in few-shot node classification on real-world TAGs.