FAST: A Synergistic Framework of Attention and State-space Models for Spatiotemporal Traffic Prediction

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Accurate traffic forecasting requires modeling complex temporal dynamics and long-range spatial dependencies across road networks at the same time, yet existing methods struggle to balance representational capacity with computational efficiency. This work proposes FAST, a framework that combines Mamba state-space models with attention mechanisms in a Temporal-Spatial-Temporal architecture. Temporal attention modules capture short- and long-term temporal patterns, while a Mamba-based spatial module models long-range correlations among sensors with linear complexity. FAST also uses a learnable multi-source spatiotemporal embedding and a multi-level skip prediction mechanism for hierarchical feature fusion. On the PeMS04, PeMS07, and PeMS08 datasets, FAST achieves the best MAE and RMSE among the compared baselines, with up to 2.8% lower MAE and 4.3% lower RMSE than the strongest baseline.

📝 Abstract

Traffic forecasting requires modeling complex temporal dynamics and long-range spatial dependencies over large sensor networks. Existing methods typically face a trade-off between expressiveness and efficiency: Transformer-based models capture global dependencies well but suffer from quadratic complexity, while recent selective state-space models are computationally efficient yet less effective at modeling spatial interactions in graph-structured traffic data. We propose FAST, a unified framework that combines attention and state-space modeling for scalable spatiotemporal traffic forecasting. FAST adopts a Temporal-Spatial-Temporal architecture, where temporal attention modules capture both short- and long-term temporal patterns, and a Mamba-based spatial module models long-range inter-sensor dependencies with linear complexity. To better represent heterogeneous traffic contexts, FAST further introduces a learnable multi-source spatiotemporal embedding that integrates historical traffic flow, temporal context, and node-level information, together with a multi-level skip prediction mechanism for hierarchical feature fusion. Experiments on PeMS04, PeMS07, and PeMS08 show that FAST consistently outperforms strong baselines from Transformer-, GNN-, attention-, and Mamba-based families. In particular, FAST achieves the best MAE and RMSE on all three benchmarks, with up to 4.3% lower RMSE and 2.8% lower MAE than the strongest baseline, demonstrating a favorable balance between accuracy, scalability, and generalization.
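
The efficiency trade-off described in the abstract can be illustrated with a toy comparison. The sketch below is not FAST's spatial module: it is a scalar, non-selective state-space recurrence with fixed parameters (Mamba makes these parameters input-dependent), set next to a scalar softmax self-attention. The function names, the parameters `decay`, `b`, and `c`, and the idea of treating the sensors as an ordered sequence are all assumptions made for illustration.

```python
import math


def diagonal_ssm_scan(xs, decay, b, c):
    """Toy state-space recurrence over a sequence of scalar sensor readings.

    h_t = decay * h_{t-1} + b * x_t,  y_t = c * h_t
    A single pass over the sequence, so the cost grows linearly with len(xs).
    The parameters are fixed here (hypothetical simplification).
    """
    h = 0.0
    ys = []
    for x in xs:
        h = decay * h + b * x  # carry a running summary of earlier sensors
        ys.append(c * h)
    return ys


def softmax_self_attention(xs):
    """Toy scalar self-attention with scores x_i * x_j.

    Every position compares itself with every other position,
    so the cost grows quadratically with len(xs).
    """
    out = []
    for xi in xs:
        scores = [xi * xj for xj in xs]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w * xj for w, xj in zip(weights, xs)) / z)
    return out
```

For example, `diagonal_ssm_scan([1.0, 0.0, 0.0], decay=0.5, b=1.0, c=1.0)` returns `[1.0, 0.5, 0.25]`: the first reading keeps influencing later positions through the hidden state, with geometrically decaying weight, without any pairwise comparison. The attention function reaches the same positions directly, at the price of one score per pair of sensors.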

Problem

Research questions and friction points this paper is trying to address.

spatiotemporal traffic prediction

attention mechanism

state-space models

scalability

graph-structured data

Innovation

Methods, ideas, or system contributions that make the work stand out.

state-space model

attention mechanism

spatiotemporal forecasting