Local Attention Mechanism: Boosting the Transformer Architecture for Long-Sequence Time Series Forecasting

📅 2024-10-04
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the O(n²) computational complexity of Transformer attention and the lack of standardized evaluation benchmarks for long-horizon time series forecasting, this paper proposes the Local Attention Mechanism (LAM), which reduces attention computation to O(n log n) by explicitly exploiting the local continuity inherent in time series. The authors design a lightweight implementation based on tensor algebra, adapt the Transformer architecture for long-sequence modeling, and introduce LongTS, a standardized benchmark suite curated for long-horizon time series forecasting. Extensive experiments show that the method consistently outperforms state-of-the-art approaches across multiple long-horizon forecasting tasks: it improves prediction accuracy by 3.2%–8.7%, accelerates inference by 2.1×, and reduces memory consumption by 64%.

📝 Abstract
Transformers have become the leading choice in natural language processing over other deep learning architectures. This trend has also permeated the field of time series analysis, especially for long-horizon forecasting, showcasing promising results both in performance and running time. In this paper, we introduce Local Attention Mechanism (LAM), an efficient attention mechanism tailored for time series analysis. This mechanism exploits the continuity properties of time series to reduce the number of attention scores computed. We present an algorithm for implementing LAM in tensor algebra that runs in time and memory O(n log n), significantly improving upon the O(n²) time and memory complexity of traditional attention mechanisms. We also note the lack of proper datasets to evaluate long-horizon forecast models. Thus, we propose a novel set of datasets to improve the evaluation of models addressing long-horizon forecasting challenges. Our experimental analysis demonstrates that the vanilla transformer architecture magnified with LAM surpasses state-of-the-art models, including the vanilla attention mechanism. These results confirm the effectiveness of our approach and highlight a range of future challenges in long-sequence time series forecasting.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational complexity in time series attention mechanisms
Improving long-horizon forecasting accuracy with local attention
Addressing dataset scarcity for long-sequence forecasting evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Local Attention Mechanism reduces the number of attention scores computed
O(n log n) algorithm improves time and memory complexity
New datasets improve long-horizon forecasting evaluation
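The core idea, restricting each query to a local neighborhood so far fewer attention scores are computed, can be illustrated with a minimal sliding-window attention sketch. This is an assumption-laden illustration, not the paper's actual LAM: the `window` hyperparameter and the per-query loop are ours, and the paper's tensor-algebra formulation achieving O(n log n) may be organized quite differently.

```python
import numpy as np

def local_attention(q, k, v, window=4):
    """Sliding-window attention sketch (NOT the paper's exact LAM).

    Each query position i attends only to keys within `window` steps,
    so roughly O(n * window) scores are computed instead of the O(n^2)
    required by full attention.
    """
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Scaled dot-product scores against the local key window only
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)
        # Numerically stable softmax over the window
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[lo:hi]
    return out
```

When `window` covers the whole sequence, this reduces to ordinary full attention; shrinking the window trades global context for the locality that time series, per the abstract, can afford to exploit.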
Ignacio Aguilera-Martos
Andalusian Institute of Data Science and Computational Intelligence (DaSCI), University of Granada, Spain
Andrés Herrera-Poyatos
Lecturer at the University of Granada, Department of Algebra. PhD from the University of Oxford
Randomised Algorithms · Computational Complexity · Combinatorics · Deep Learning
Julián Luengo
Andalusian Institute of Data Science and Computational Intelligence (DaSCI), University of Granada, Spain
Francisco Herrera
Professor Computer Science and AI, DaSCI Research Institute, Granada University, Spain
Artificial Intelligence · Computational Intelligence · Data Science · Trustworthy AI