🤖 AI Summary
This work investigates whether linear Transformers can implement classical graph algorithms, specifically electric flow computation and eigenvector decomposition, when the model observes the input graph only through its incidence matrix. The authors give explicit weight constructions under which a linear Transformer simulates each algorithm, and they bound the constructed Transformer's error by the error of the underlying algorithm. Experiments on synthetic graphs corroborate the theory, and on a real-world molecular regression task the linear Transformer learns a positional encoding that is empirically more effective than the standard one based on Laplacian eigenvectors. Code is publicly released.
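For context, the baseline positional encoding mentioned above can be sketched as follows. This is a generic illustration of Laplacian eigenvector encodings, not the paper's construction; the function name and graph are hypothetical.

```python
import numpy as np

def laplacian_pe(A, k):
    """Standard Laplacian eigenvector positional encoding for adjacency matrix A:
    each node is tagged with its entries in the k eigenvectors of L = D - A
    with the smallest nonzero eigenvalues."""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A            # combinatorial graph Laplacian
    vals, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return vecs[:, 1:k + 1]         # skip the constant (eigenvalue-0) eigenvector

# Example: a 4-cycle graph.
A = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
], dtype=float)
pe = laplacian_pe(A, k=2)  # one 2-dimensional encoding per node
print(pe.shape)            # (4, 2)
```

Encodings of this form are the default the paper compares against; the claim is that the linear Transformer can learn a more effective one on the molecular task.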
📝 Abstract
We show theoretically and empirically that the linear Transformer, when applied to graph data, can implement algorithms that solve canonical problems such as electric flow and eigenvector decomposition. The Transformer accesses information about the input graph only via the graph's incidence matrix. We present explicit weight configurations for implementing each algorithm, and we bound the constructed Transformers' errors by the errors of the underlying algorithms. Our theoretical findings are corroborated by experiments on synthetic data. Additionally, on a real-world molecular regression task, we observe that the linear Transformer is capable of learning a more effective positional encoding than the default one based on Laplacian eigenvectors. Our work is an initial step towards elucidating the inner workings of the Transformer for graph data. Code is available at https://github.com/chengxiang/LinearGraphTransformer
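To make the electric flow problem concrete, here is a minimal NumPy sketch of the classical (non-Transformer) computation from the incidence matrix: with unit resistances, the Laplacian is L = Bᵀ B, node potentials are φ = L⁺ b for a demand vector b, and the edge flow is f = B φ. The specific triangle graph below is an illustrative assumption, not an example from the paper.

```python
import numpy as np

# Incidence matrix B of a triangle graph on nodes 0, 1, 2:
# one row per oriented edge, +1 at the tail, -1 at the head.
B = np.array([
    [1, -1,  0],   # edge 0 -> 1
    [0,  1, -1],   # edge 1 -> 2
    [1,  0, -1],   # edge 0 -> 2
], dtype=float)

L = B.T @ B                      # graph Laplacian (unit resistances)
b = np.array([1.0, 0.0, -1.0])   # inject 1 unit of current at node 0, extract at node 2
f = B @ np.linalg.pinv(L) @ b    # electric flow on each edge

print(f)        # per-edge flow: 2/3 on the direct edge, 1/3 along the 2-hop path
print(B.T @ f)  # flow conservation: net flow at each node equals the demand b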