🤖 AI Summary
This work addresses zero-shot reasoning on text-rich networks (TRNs), where models must combine textual semantics with relational structure without task-specific supervision. It proposes TRN-R1-Zero, a post-training framework trained solely via reinforcement learning, requiring neither supervised fine-tuning nor chain-of-thought data generated by large reasoning models. The approach optimizes base large language models with a neighbor-aware Group Relative Policy Optimization objective that adjusts rewards according to a margin gain metric for how informative the neighboring signals are, guiding the model toward relational reasoning. Across citation, hyperlink, social, and co-purchase TRN benchmarks, the model is reported to be superior and robust, and although it is trained only on node-level tasks, it performs zero-shot inference on edge-level and graph-level tasks, going beyond cross-domain transfer.
📝 Abstract
Zero-shot reasoning on text-rich networks (TRNs) remains a challenging frontier, as models must integrate textual semantics with relational structure without task-specific supervision. While graph neural networks rely on fixed label spaces and supervised objectives, recent large language model (LLM)-based approaches often overlook graph context or depend on distillation from larger models, limiting generalisation. We propose TRN-R1-Zero, a post-training framework for TRN reasoning trained solely via reinforcement learning. TRN-R1-Zero directly optimises base LLMs using a Neighbour-aware Group Relative Policy Optimisation objective that dynamically adjusts rewards based on a novel margin gain metric for the informativeness of neighbouring signals, effectively guiding the model toward relational reasoning. Unlike prior methods, TRN-R1-Zero requires no supervised fine-tuning or chain-of-thought data generated from large reasoning models. Extensive experiments across citation, hyperlink, social and co-purchase TRN benchmarks demonstrate the superiority and robustness of TRN-R1-Zero. Moreover, relying strictly on node-level training, TRN-R1-Zero achieves zero-shot inference on edge- and graph-level tasks, extending beyond cross-domain transfer. The codebase is publicly available at https://github.com/superallen13/TRN-R1-Zero.
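To make the idea of a neighbour-aware, group-relative objective more concrete, the sketch below shows one way such a signal could be computed. Only the group-relative normalisation follows the standard GRPO recipe. The `margin_gain` definition, the `alpha` coefficient, the clipping range, and the choice to weight advantages per query are all illustrative assumptions and are not taken from the paper or its codebase.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standard GRPO-style step: centre and scale rewards within one
    group of responses sampled for the same query."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def margin_gain(margin_with_neighbours, margin_without_neighbours):
    """ASSUMPTION: informativeness of neighbours, measured as the change in
    the gap between the correct label's score and the best wrong label's
    score when neighbour text is added to the prompt."""
    return margin_with_neighbours - margin_without_neighbours


def neighbour_aware_advantages(correct, gain, alpha=0.5):
    """ASSUMPTION: weight the whole group's advantages by a clipped function
    of the margin gain, so queries with helpful neighbours count for more."""
    base_rewards = [1.0 if c else 0.0 for c in correct]
    advantages = group_relative_advantages(base_rewards)
    clipped_gain = max(-1.0, min(1.0, gain))
    weight = 1.0 + alpha * clipped_gain
    return [weight * a for a in advantages]


# Hypothetical group of four sampled answers for one node-classification query.
gain = margin_gain(margin_with_neighbours=0.7, margin_without_neighbours=0.3)
advantages = neighbour_aware_advantages([True, False, True, False], gain)
```

In this sketch the weight is applied after normalisation because multiplying every reward in a group by the same factor would be cancelled by the mean and standard-deviation scaling. Whether TRN-R1-Zero handles this in the same way is not stated in the abstract.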