π€ AI Summary
Existing attention mechanisms for time-series forecasting (TSF) suffer from insufficient local modeling, weak capture of periodic patterns, and inter-variable redundancy. To address these limitations, we propose PeriodNetβa novel Transformer-based architecture featuring two innovations: (i) period-aware attention and sparse period-aware attention, which explicitly model multi-scale periodicity while jointly enhancing local feature extraction and global dependency learning; and (ii) an iterative grouping strategy to suppress inter-variable redundancy, coupled with a period diffuser for high-accuracy multi-period forecasting. All components are tightly integrated into an enhanced encoder to strengthen representation learning. Extensive experiments across eight benchmark datasets demonstrate that PeriodNet consistently outperforms six state-of-the-art models on both univariate and multivariate TSF tasks. Notably, it achieves a 22% average reduction in MAE for long-horizon predictions up to 720 steps.
π Abstract
The attention mechanism has demonstrated remarkable potential in sequence modeling, exemplified by its successful application in natural language processing with models such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT). Despite these advancements, its utilization in time series forecasting (TSF) has yet to meet expectations. Exploring a better network structure for attention in TSF holds immense significance across various domains. In this paper, we present PeriodNet with a brand new structure to forecast univariate and multivariate time series. PeriodNet incorporates period attention and sparse period attention mechanism for analyzing adjacent periods. It enhances the mining of local characteristics, periodic patterns, and global dependencies. For efficient cross-variable modeling, we introduce an iterative grouping mechanism which can directly reduce the cross-variable redundancy. To fully leverage the extracted features on the encoder side, we redesign the entire architecture of the vanilla Transformer and propose a period diffuser for precise multi-period prediction. Through comprehensive experiments conducted on eight datasets, we demonstrate that PeriodNet outperforms six state-of-the-art models in both univariate and multivariate TSF scenarios in terms of mean square error and mean absolute error. In particular, PeriodNet achieves a relative improvement of 22% when forecasting time series with a length of 720, in comparison to other models based on the conventional encoder-decoder Transformer architecture.