🤖 AI Summary
Graph Transformers for node classification can suffer from “distance misalignment,” a mismatch between the model’s communication range and the receptive field the task requires, because indiscriminate global information mixing can degrade performance. This work builds a controllable synthetic benchmark to analyze how the preferred graph-distance bias of a Graph Transformer changes with task locality, and introduces “distance-misaligned training” to describe a mismatch between where label-relevant information lies and where the model allocates communication over graph distance. Building on these insights, the authors propose an adaptive graph-aware controller that adjusts communication weights across graph distances. In the experiments, an oracle adaptive controller strongly improves over a neutral baseline on local and mixed-locality tasks and nearly matches the best fixed bias, whereas a task-agnostic controller is weaker, indicating that the control target matters and not adaptation alone.
📝 Abstract
Graph Transformers can mix information globally, but this flexibility also creates failure modes: some tasks require long-range communication while others are better served by local interaction. We study this through a synthetic node-classification benchmark on contextual stochastic block model graphs, where labels are generated by a controllable mixture of local and far-shell signals. We define distance-misaligned training as a mismatch between where label-relevant information lies and where the model allocates communication over graph distance. On this benchmark, we report three findings. First, the preferred graph-distance bias changes systematically with task locality. Second, an oracle adaptive controller, given offline access to the task-side distance target, nearly matches the best fixed bias across regimes and strongly improves over a neutral baseline on mixed and local tasks. Third, a task-agnostic zero-gap controller is weaker, indicating that adaptation alone is not enough and that the control target matters. These results suggest that distance-resolved diagnosis is useful for understanding Graph Transformer failures and for designing graph-aware control.
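To make "allocating communication over graph distance" concrete, the following minimal sketch shows one plausible way a graph-distance bias could enter attention and how the resulting attention mass per hop distance could be measured. The abstract does not specify the bias form, so the additive per-distance bias on attention logits, the function names, and the toy path graph are all assumptions for illustration, not the authors' implementation.

```python
import math
from collections import deque

def shortest_path_distances(adj):
    """All-pairs hop distances via BFS; unreachable pairs are marked -1."""
    n = len(adj)
    dist = [[-1] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s][v] < 0:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist

def biased_attention(scores, dist, bias):
    """Row-wise softmax of scores[i][j] + bias[d(i, j)] (assumed bias form).

    Distances beyond the last bias entry reuse that entry;
    unreachable pairs receive zero attention.
    """
    attn = []
    for i, row in enumerate(scores):
        logits = []
        for j, s in enumerate(row):
            d = dist[i][j]
            if d < 0:
                logits.append(-math.inf)
            else:
                logits.append(s + bias[min(d, len(bias) - 1)])
        m = max(logits)  # finite, because d(i, i) = 0 is always reachable
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        attn.append([e / z for e in exps])
    return attn

def attention_mass_by_distance(attn, dist, max_d):
    """Average share of attention that nodes place at each hop distance."""
    n = len(attn)
    mass = [0.0] * (max_d + 1)
    for i in range(n):
        for j in range(n):
            d = dist[i][j]
            if d >= 0:
                mass[min(d, max_d)] += attn[i][j] / n
    return mass

# Toy example: a 4-node path graph 0 - 1 - 2 - 3 with uninformative scores.
adj = [[1], [0, 2], [1, 3], [2]]
dist = shortest_path_distances(adj)
scores = [[0.0] * 4 for _ in range(4)]

neutral_bias = [0.0, 0.0, 0.0, 0.0]   # no distance preference
local_bias = [0.0, 0.0, -4.0, -4.0]   # penalize distances of 2 hops or more

neutral_mass = attention_mass_by_distance(
    biased_attention(scores, dist, neutral_bias), dist, 3)
local_mass = attention_mass_by_distance(
    biased_attention(scores, dist, local_bias), dist, 3)
```

Comparing `neutral_mass` with `local_mass` gives a distance-resolved view of where the model communicates. Under this sketch, a mismatch between such a profile and the distances at which label-relevant signal lives would correspond to the distance misalignment described above.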