🤖 AI Summary
Long-term traffic forecasting faces dual challenges: difficulty modeling high-frequency phenomena (e.g., traffic shocks, congestion boundaries) and rapid error accumulation in multi-step rollout—primarily due to excessive smoothness in outputs of existing neural operators, which fail to capture sharp features such as density gradients. To address this, we propose DiffNO: the first unified framework integrating Transformer-based neural operators with diffusion mechanisms. It models functional mappings via cross-attention and introduces a progressive denoising refinement module to explicitly recover high-frequency traffic structures. Innovatively embedding super-resolution modeling and iterative denoising into an end-to-end architecture, DiffNO achieves significant improvements on chaotic traffic datasets: 23.6% reduction in long-term prediction error and 31.4% improvement in density gradient fidelity. This enhances modeling capability for shock propagation and congestion evolution, outperforming state-of-the-art neural operator methods.
📝 Abstract
Accurate long-term traffic forecasting remains a critical challenge in intelligent transportation systems, particularly when predicting high-frequency traffic phenomena such as shock waves and congestion boundaries over extended rollout horizons. Neural operators have recently gained attention as promising tools for modeling traffic flow. While effective at learning function space mappings, they inherently produce smooth predictions that fail to reconstruct high-frequency features such as sharp density gradients which results in rapid error accumulation during multi-step rollout predictions essential for real-time traffic management. To address these fundamental limitations, we introduce a unified Diffusion-Enhanced Transformer Neural Operator (DETNO) architecture. DETNO leverages a transformer neural operator with cross-attention mechanisms, providing model expressivity and super-resolution, coupled with a diffusion-based refinement component that iteratively reconstructs high-frequency traffic details through progressive denoising. This overcomes the inherent smoothing limitations and rollout instability of standard neural operators. Through comprehensive evaluation on chaotic traffic datasets, our method demonstrates superior performance in extended rollout predictions compared to traditional and transformer-based neural operators, preserving high-frequency components and improving stability over long prediction horizons.