🤖 AI Summary
Graph Transformers (GTs) mitigate over-smoothing in GNNs via global attention, but we show that this global attention itself suffers from severe over-smoothing, degrading node representations. To address this, we propose a PageRank-enhanced attention mechanism, the first to theoretically embed PageRank into a Transformer architecture. The resulting attention acts as a graph-structure-aware adaptive-pass filter that overcomes the inherent low-pass limitation of conventional GTs while preserving both global context and hierarchical structural information. The method comprises four components: PageRank-guided sparse attention, bandpass spectral filtering, structure-aware positional encoding, and a linear-complexity implementation. Extensive experiments on 11 diverse graph datasets, ranging from thousands to millions of nodes, demonstrate consistent and significant improvements over state-of-the-art methods on both node and graph classification tasks. The code is publicly available.
📝 Abstract
Graph Transformers (GTs) have emerged as a promising tool for graph learning, leveraging their all-pair connectivity to effectively capture global information. Global attention was originally introduced to address the over-smoothing problem in deep GNNs, eliminating the need to stack deep GNN layers. However, through empirical and theoretical analysis, we verify that this global attention itself exhibits severe over-smoothing: its inherent low-pass filtering causes node representations to become indistinguishable, and the effect is even stronger than that observed in GNNs. To mitigate this, we propose PageRank Transformer (ParaFormer), which features a PageRank-enhanced attention module designed to mimic the behavior of deep Transformers. We demonstrate, both theoretically and empirically, that ParaFormer mitigates over-smoothing by functioning as an adaptive-pass filter. Experiments show that ParaFormer achieves consistent performance improvements on both node classification and graph classification tasks across 11 datasets ranging from thousands to millions of nodes, validating its efficacy. The supplementary material, including code and appendix, is available at https://github.com/chaohaoyuan/ParaFormer.
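For readers who want a concrete picture, below is a minimal PyTorch sketch of what a PageRank-enhanced attention layer could look like. It is an illustration only, not the ParaFormer implementation: the power-iteration approximation of personalized PageRank, the simple 50/50 mixing of attention and PageRank scores, and all names (`PageRankAttention`, `personalized_pagerank`, `alpha`) are assumptions made for this sketch; see the paper and repository for the actual module.

```python
# Minimal sketch of a "PageRank-enhanced attention" layer (illustrative only).
# NOT the authors' implementation: the PPR approximation and the way PPR is
# blended with dense attention below are assumptions for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F


def personalized_pagerank(adj: torch.Tensor, alpha: float = 0.15, iters: int = 20) -> torch.Tensor:
    """Power-iteration approximation of the personalized PageRank matrix.

    adj: dense [N, N] adjacency matrix.
    Returns an [N, N] matrix whose row i is the PPR vector personalized to node i.
    """
    deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
    trans = adj / deg                        # row-stochastic transition matrix
    restart = torch.eye(adj.size(0))         # restart (personalization) distribution
    ppr = restart.clone()
    for _ in range(iters):
        ppr = alpha * restart + (1.0 - alpha) * ppr @ trans
    return ppr


class PageRankAttention(nn.Module):
    """Single-head attention whose scores are re-weighted by graph structure via PPR."""

    def __init__(self, dim: int, alpha: float = 0.15):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.alpha = alpha

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Standard scaled dot-product attention over all node pairs.
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
        attn = F.softmax(scores, dim=-1)
        # Mix the attention matrix with the PPR matrix so that graph structure
        # modulates the all-pair attention (one simple "PageRank-enhanced" variant).
        ppr = personalized_pagerank(adj, alpha=self.alpha)
        mixed = 0.5 * attn + 0.5 * ppr
        mixed = mixed / mixed.sum(dim=-1, keepdim=True)
        return mixed @ v


if __name__ == "__main__":
    n, d = 6, 16
    x = torch.randn(n, d)
    adj = (torch.rand(n, n) > 0.6).float()
    adj = ((adj + adj.t()) > 0).float()      # symmetrize the random graph
    out = PageRankAttention(d)(x, adj)
    print(out.shape)                         # torch.Size([6, 16])
```

The sketch keeps the dense all-pair attention of a standard Transformer and only adds a structure-dependent re-weighting term; a faithful reproduction of ParaFormer's sparse, linear-complexity attention and its spectral-filter interpretation should follow the released code instead.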