AI Summary
Existing graph neural networks rely on message passing, which struggles to capture long-range dependencies efficiently. This work proposes a linearized graph sequence modeling framework that, for the first time, reformulates information propagation on graphs as a sequence modeling problem. By decoupling computational depth from information propagation depth, the approach identifies the sequence properties a model needs in order to preserve graph inductive biases. The method systematically integrates advanced sequence modeling paradigms with graph-structured priors and achieves substantial performance gains over current models on multiple long-range dependency tasks. These results indicate that sequence modeling can strengthen graph representation learning.
Abstract
Message-passing-based approaches form the default backbone of most learning architectures on graph-structured data. However, the rapid progress of modern deep learning architectures in other domains, particularly sequence modeling, raises the question of how graph learning can benefit from these advances. We introduce Linearized Graph Sequence Models, a framework that recasts message-passing graph computation from the perspective of sequence modeling to simplify architectural choices. Our approach systematically separates the computational processing depth from the information propagation depth, allowing core graph architectural decisions to be treated as sequence modeling choices. We analyze, both empirically and theoretically, which sequence properties make methods effective for learning and preserving the graph inductive bias. We validate these findings by demonstrating improved performance on long-range information tasks on graphs. Our findings provide a principled way to integrate modern sequence modeling advances into message-passing-based graph learning. Beyond this, our work demonstrates how the separation of processing and information depth can recast central architectural questions as input modeling choices.
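To make the separation of propagation depth and processing depth concrete, here is a minimal NumPy sketch of one possible reading of the idea. It is not the authors' implementation. It assumes that each node's sequence is built from its k-hop aggregated features (x, Âx, Â²x, ...) and that a sequence model then consumes that sequence. The function names `normalized_adjacency`, `hop_sequences`, and `linear_recurrence`, the symmetric normalization, and the toy decay recurrence are all illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}.

    Assumption: the abstract does not specify a propagation operator.
    """
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def hop_sequences(adj, x, num_hops):
    """Linearize the graph into per-node sequences.

    Returns an array of shape (num_nodes, num_hops + 1, feat_dim) whose
    k-th sequence element is the k-hop aggregated feature of each node.
    num_hops sets the information propagation depth.
    """
    a_hat = normalized_adjacency(adj)
    hops = [x]
    for _ in range(num_hops):
        hops.append(a_hat @ hops[-1])
    return np.stack(hops, axis=1)


def linear_recurrence(seq, decay):
    """Toy stand-in for a sequence model: h_k = decay * h_{k-1} + seq[:, k].

    Returns the final state per node. Any sequence model could replace
    this; its depth is independent of num_hops.
    """
    h = np.zeros_like(seq[:, 0])
    for k in range(seq.shape[1]):
        h = decay * h + seq[:, k]
    return h


# Path graph 0 - 1 - 2 - 3 - 4 with a signal placed only on node 0.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
x = np.zeros((5, 1))
x[0, 0] = 1.0

seq = hop_sequences(adj, x, num_hops=4)   # propagation depth: 4 hops
out = linear_recurrence(seq, decay=0.5)   # processing: a single recurrence
```

In this toy setup, node 4 first receives any signal from node 0 at the fourth sequence position, so reaching distant nodes is controlled by the sequence length rather than by stacking more layers. Under these assumptions, the choice of sequence model becomes the main architectural decision.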