🤖 AI Summary
This work addresses the limitations of existing linear models in explicitly capturing complex long-term dependencies in time series and the high computational cost of Transformer-based approaches. To this end, we propose LSINet, a lightweight sparse interaction network that introduces a multi-head sparse interaction mechanism (MSIM) within a pure MLP architecture. MSIM employs Bernoulli-distributed adaptive sparsity to dynamically establish connections, enhanced by shared interaction learning (SIL) and an adaptive regularization loss to improve both efficiency and convergence. Extensive experiments demonstrate that LSINet significantly outperforms state-of-the-art linear models and Transformer variants across multiple public datasets, achieving leading performance in both forecasting accuracy and computational efficiency.
📝 Abstract
Recent work shows that linear models can outperform several Transformer-based models in long-term time-series forecasting (TSF). However, instead of explicitly performing temporal interaction through self-attention, linear models perform it implicitly through stacked MLP structures, which may be insufficient for capturing complex temporal dependencies and leaves room for improvement. To this end, we propose a Lightweight Sparse Interaction Network (LSINet) for the TSF task. Inspired by the sparsity of self-attention, we propose a Multihead Sparse Interaction Mechanism (MSIM). Unlike self-attention, MSIM learns the important connections between time steps through a sparsity-induced Bernoulli distribution to capture temporal dependencies for TSF. Sparsity is ensured by the proposed self-adaptive regularization loss. Moreover, we observe that temporal interactions are shareable and propose Shared Interactions Learning (SIL) for MSIM to further enhance efficiency and improve convergence. LSINet is a linear model comprising only MLP structures with low overhead, equipped with an explicit temporal interaction mechanism. Extensive experiments on public datasets show that LSINet achieves both higher accuracy and better efficiency than advanced linear models and Transformer models on TSF tasks.
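The abstract gives no implementation details, but the core idea of a Bernoulli-gated temporal interaction can be illustrated with a minimal NumPy sketch. Everything below is our assumption, not the authors' code: the function names (`sparse_interaction`, `sparsity_loss`), the shapes, and the choice to use the sigmoid of learnable logits as Bernoulli keep-probabilities (with the expected mask as a relaxed, differentiable surrogate) are all hypothetical readings of the description.

```python
import numpy as np

def sparse_interaction(x, logits, weights, hard=False, rng=None):
    """One head of a hypothetical Bernoulli-gated temporal interaction.

    x:       (L, C) input window with L time steps and C channels
    logits:  (L, L) learnable gate logits; sigmoid(logits) gives the
             Bernoulli keep-probability for each pair of time steps
    weights: (L, L) dense MLP-style mixing matrix over time steps
    """
    probs = 1.0 / (1.0 + np.exp(-logits))      # Bernoulli parameters
    if hard:
        rng = rng or np.random.default_rng(0)
        mask = (rng.random(probs.shape) < probs).astype(float)  # sampled mask
    else:
        mask = probs                           # expected (relaxed) mask
    # Sparse explicit temporal mixing: gated weights applied across time steps
    return (mask * weights) @ x

def sparsity_loss(logits, target=0.1):
    """Toy stand-in for the self-adaptive regularizer: push the mean
    keep-probability toward a small target sparsity level."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    return (probs.mean() - target) ** 2
```

Under this reading, Shared Interactions Learning would correspond to reusing one `logits` gate across multiple heads that each keep their own `weights`, which reduces the number of gate parameters and is one plausible way the shared mechanism could aid convergence.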