🤖 AI Summary
Existing positional encodings for temporal link prediction on dynamic graphs suffer from three key limitations: (i) predefined functional forms fail to capture complex attributed graph structures; (ii) learnable encodings neglect the joint temporal evolution of topology and node features; and (iii) Transformer-based attention incurs prohibitive computational overhead on large-scale graphs. Method: The paper proposes L-STEP, a simple temporal link prediction model built on a learnable spatial-temporal positional encoding that jointly models structural dynamics and node-attribute evolution. The encoding is theoretically grounded in spectral graph theory, guaranteeing preservation of graph properties from a spatial-temporal spectral viewpoint, and the model combines it with multiple sampling strategies and a complexity-aware design. Contribution/Results: A lightweight MLP equipped with the learned encoding matches or exceeds Transformer performance. Evaluated on 13 classic datasets and the large-scale Temporal Graph Benchmark (TGB), L-STEP achieves state-of-the-art accuracy and inference speed against 10 competitive baselines.
📝 Abstract
Accurate predictions rely on the expressive power of graph deep learning frameworks such as graph neural networks and graph transformers, in which a positional encoding mechanism has become indispensable in recent state-of-the-art works for recording canonical position information. However, current positional encodings are limited in three aspects: (1) most methods use pre-defined, fixed functions, which cannot adapt to complex attributed graphs; (2) a few pioneering works propose learnable positional encodings but remain limited to structural information, ignoring real-world time-evolving topological and feature information; (3) most positional encoding methods are paired with transformers' attention mechanism to fully leverage their capabilities, yet dense or relational attention is often unaffordable on large-scale structured data. Hence, we develop a Learnable Spatial-Temporal Positional Encoding in an effective and efficient manner and propose a simple temporal link prediction model named L-STEP. Briefly, for L-STEP, we (1) prove that the proposed positional learning scheme preserves graph properties from a spatial-temporal spectral viewpoint, (2) verify that MLPs can fully exploit the encoding's expressiveness and reach transformers' performance, (3) vary the initial positional encoding inputs to show robustness, (4) analyze the theoretical complexity and obtain lower empirical running time than SOTA methods, and (5) demonstrate superior temporal link prediction performance on 13 classic datasets against 10 algorithms in both transductive and inductive settings under 3 different sampling strategies. L-STEP also obtains the leading performance in the newest large-scale TGB benchmark. Our code is available at https://github.com/kthrn22/L-STEP.
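The abstract's central claim is that a plain MLP over learnable, temporally updated positional encodings can score candidate links without transformer attention. The toy NumPy sketch below illustrates that general idea only; the dimensions, the neighbor-mixing update, and the two-layer scorer are all hypothetical stand-ins and are not the paper's actual L-STEP architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper)
num_nodes, pe_dim, feat_dim, hidden = 50, 8, 4, 16

pos_enc = rng.normal(size=(num_nodes, pe_dim))    # learnable positional encodings
feats = rng.normal(size=(num_nodes, feat_dim))    # static node attribute features


def update_positional_encoding(pos_enc, edges, alpha=0.5):
    """Toy temporal update: mix each endpoint's encoding with its neighbor's,
    a crude stand-in for learning from evolving topology."""
    new = pos_enc.copy()
    for u, v in edges:
        new[u] = alpha * new[u] + (1 - alpha) * pos_enc[v]
        new[v] = alpha * new[v] + (1 - alpha) * pos_enc[u]
    return new


# Two-layer MLP link scorer over concatenated (feature, encoding) pairs
W1 = rng.normal(size=(2 * (pe_dim + feat_dim), hidden)) * 0.1
W2 = rng.normal(size=(hidden, 1)) * 0.1


def score_link(u, v, pos_enc):
    """Probability-like score for a candidate link (u, v)."""
    x = np.concatenate([feats[u], pos_enc[u], feats[v], pos_enc[v]])
    h = np.maximum(x @ W1, 0.0)                      # ReLU
    return float(1.0 / (1.0 + np.exp(-(h @ W2))))    # sigmoid, in (0, 1)


edges_t = [(0, 1), (1, 2), (2, 3)]                   # edges observed at time t
pos_enc = update_positional_encoding(pos_enc, edges_t)
p = score_link(0, 2, pos_enc)                        # score a future link
```

In practice the encodings and MLP weights would be trained end-to-end on observed temporal edges; the sketch only shows the forward pass that replaces attention with an MLP.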