🤖 AI Summary
This work addresses the limitation of conventional Transformer softmax attention in disentangling multi-scale temporal structures in time series data, such as global trends, local shocks, and seasonality. To this end, the authors propose CAPS, a structured attention mechanism that explicitly models distinct temporal patterns within a single layer by combining three additive pathways: Riemann softmax, prefix-product gating, and a Clock baseline, while leveraging SO(2) rotations for phase alignment. A shared Clock mechanism modulates information flow across these pathways through learned, time-aware importance weights, unifying attention, recurrent gating, and temporal alignment in a single framework. The method outperforms standard softmax and linear attention on both short- and long-horizon forecasting benchmarks and remains competitive with seven strong baselines while retaining linear computational complexity.
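Concretely, the SO(2) phase alignment rotates each 2D feature pair by a time-dependent angle, and the three pathway outputs are summed under the Clock's time-aware weights. The display below is only a schematic reading of that description, not a formula from the paper; the angle $\theta_t$, weights $w_p(t)$, and path notation are our own shorthand:

$$
R(\theta_t)\begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix}
=
\begin{pmatrix}
\cos\theta_t & -\sin\theta_t \\
\sin\theta_t & \cos\theta_t
\end{pmatrix}
\begin{pmatrix} x_{2i} \\ x_{2i+1} \end{pmatrix},
\qquad
y_t = \sum_{p \in \{\mathrm{softmax},\,\mathrm{prefix},\,\mathrm{clock}\}} w_p(t)\,\mathrm{path}_p(t).
$$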
📝 Abstract
This paper presents $\textbf{CAPS}$ (Clock-weighted Aggregation with Prefix-products and Softmax), a structured attention mechanism for time series forecasting that decouples three distinct temporal structures: global trends, local shocks, and seasonal patterns. Standard softmax attention entangles these through global normalization, while recent recurrent models sacrifice long-term, order-independent selection for order-dependent causal structure. CAPS combines SO(2) rotations for phase alignment with three additive gating paths -- Riemann softmax, prefix-product gates, and a Clock baseline -- within a single attention layer. We introduce the Clock mechanism, a learned temporal weighting that modulates these paths through a shared notion of temporal importance. On long- and short-term forecasting benchmarks, CAPS surpasses vanilla softmax and linear attention mechanisms and performs competitively against seven strong baselines while retaining linear complexity. Our code is available at https://github.com/vireshpati/CAPS-Attention.
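To illustrate how the pieces could fit together, here is a minimal PyTorch sketch of a layer with this shape: rotary-style SO(2) rotations on queries and keys, a causal linear-attention path standing in for the Riemann softmax, a cumulative sigmoid gate standing in for the prefix-product path, a simple baseline path, and a learned time-indexed softmax playing the role of the Clock weights. This is an assumption-laden illustration rather than the authors' implementation; the names `CAPSSketch`, `so2_rotate`, and `clock`, and all shapes and forms, are hypothetical. The actual method is in the linked repository.

```python
# Illustrative sketch only -- NOT the CAPS implementation from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


def so2_rotate(x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Rotate 2D feature pairs by time-dependent angles (rotary-style SO(2) phase alignment)."""
    d = x.shape[-1]
    freqs = torch.arange(d // 2, dtype=x.dtype, device=x.device) / max(d // 2, 1)
    angles = t[:, None] * freqs                      # (T, d/2): one angle per pair per step
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]              # split features into pairs
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class CAPSSketch(nn.Module):
    """Three additive paths mixed by learned, time-aware ("Clock") weights."""

    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.gate = nn.Linear(dim, dim)      # per-step gate for the prefix-product path
        self.baseline = nn.Linear(dim, dim)  # stand-in for the Clock baseline path
        self.clock = nn.Linear(1, 3)         # time -> importance weight of each path
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        t = torch.arange(T, dtype=x.dtype, device=x.device)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = so2_rotate(q, t), so2_rotate(k, t)    # phase alignment

        # Path 1: causal linear attention with positive feature maps
        # (a linear-time stand-in for the Riemann softmax path).
        phi_q, phi_k = F.elu(q) + 1, F.elu(k) + 1
        kv = torch.cumsum(phi_k.unsqueeze(-1) * v.unsqueeze(-2), dim=1)   # (B, T, D, D)
        z = torch.cumsum(phi_k, dim=1)                                    # (B, T, D)
        attn_path = (phi_q.unsqueeze(-2) @ kv).squeeze(-2) / (
            (phi_q * z).sum(-1, keepdim=True) + 1e-6)

        # Path 2: prefix-product gating (order-dependent, recurrent-style decay).
        prefix = torch.cumprod(torch.sigmoid(self.gate(x)), dim=1)
        gate_path = prefix * v

        # Path 3: baseline path, here just a linear transform of the input.
        base_path = self.baseline(x)

        # Clock mechanism: shared time-aware importance weights over the three paths.
        w = torch.softmax(self.clock(t[:, None]), dim=-1)                 # (T, 3)
        mixed = w[:, 0:1] * attn_path + w[:, 1:2] * gate_path + w[:, 2:3] * base_path
        return self.out(mixed)


# Example: y = CAPSSketch(dim=64)(torch.randn(2, 96, 64))  # -> shape (2, 96, 64)
```

Because every path is computed with cumulative sums or products over the sequence, the whole layer stays linear in sequence length, consistent with the complexity claim in the abstract.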