Fusing Large Language Models with Temporal Transformers for Time Series Forecasting

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two key limitations in time-series forecasting: the insufficient temporal modeling capability of large language models (LLMs) and the lack of high-level semantic understanding in conventional Transformers. To bridge this gap, we propose a novel collaborative fusion architecture integrating LLMs and time-series Transformers. Methodologically, we design a dual-stream encoder: one stream leverages prompt learning to guide the LLM in extracting sequence-level semantic representations, while the other employs a time-series Transformer to capture dynamic temporal dependencies. These streams undergo learnable cross-modal representation fusion at intermediate layers and are jointly optimized end-to-end. Our key contribution is the first explicit, differentiable, and structurally grounded integration of LLM-derived semantic priors with time-series dynamics. Extensive experiments on multiple benchmark datasets demonstrate that our method significantly outperforms both pure-LLM baselines and state-of-the-art time-series models (e.g., Informer, Autoformer), achieving average MAE reductions of 12.3%–18.7%.
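To make the dual-stream design concrete, the following is a minimal, hypothetical PyTorch sketch of the architecture described in the summary: a trainable time-series Transformer stream, a frozen stand-in for the prompt-guided LLM stream, and a learnable cross-attention fusion of the two. All module names, dimensions, and the gating scheme are illustrative assumptions, not the paper's actual implementation.

    # Hypothetical sketch of the dual-stream encoder with learnable cross-modal
    # fusion; module names, sizes, and the gating scheme are illustrative, and the
    # LLM stream is stood in by a frozen generic Transformer encoder.
    import torch
    import torch.nn as nn

    class CrossModalFusion(nn.Module):
        """Fuse semantic (LLM-stream) tokens into the temporal stream via cross-attention."""
        def __init__(self, d_model: int, n_heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.gate = nn.Linear(2 * d_model, d_model)  # learnable fusion gate
            self.norm = nn.LayerNorm(d_model)

        def forward(self, temporal: torch.Tensor, semantic: torch.Tensor) -> torch.Tensor:
            # temporal: (B, L, d); semantic: (B, S, d)
            attended, _ = self.attn(query=temporal, key=semantic, value=semantic)
            gate = torch.sigmoid(self.gate(torch.cat([temporal, attended], dim=-1)))
            return self.norm(temporal + gate * attended)

    class DualStreamForecaster(nn.Module):
        def __init__(self, d_model: int = 64, horizon: int = 24, n_layers: int = 2):
            super().__init__()
            self.value_embed = nn.Linear(1, d_model)             # temporal-stream input projection
            self.prompt = nn.Parameter(torch.randn(8, d_model))  # learnable prompt tokens
            self.llm_stream = nn.TransformerEncoder(             # frozen stand-in for the LLM
                nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), n_layers)
            for p in self.llm_stream.parameters():
                p.requires_grad = False
            self.ts_stream = nn.TransformerEncoder(              # trainable time-series Transformer
                nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), n_layers)
            self.fusion = CrossModalFusion(d_model)
            self.head = nn.Linear(d_model, horizon)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, L) univariate history
            tokens = self.value_embed(x.unsqueeze(-1))            # (B, L, d)
            prompts = self.prompt.expand(x.size(0), -1, -1)       # (B, P, d)
            semantic = self.llm_stream(torch.cat([prompts, tokens], dim=1))
            temporal = self.ts_stream(tokens)
            hybrid = self.fusion(temporal, semantic)              # fused hybrid representation
            return self.head(hybrid[:, -1])                       # forecast from the last fused token

    if __name__ == "__main__":
        model = DualStreamForecaster()
        history = torch.randn(4, 96)     # batch of 4 series, 96 past steps
        print(model(history).shape)      # torch.Size([4, 24])

Because the LLM stand-in is frozen, only the prompt tokens, the temporal stream, the fusion module, and the forecasting head receive gradients, which mirrors the prompt-learning and joint end-to-end optimization described above.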

📝 Abstract
Recently, large language models (LLMs) have demonstrated powerful capabilities across a wide range of tasks and have therefore been applied in recent studies to time series forecasting (TSF), the task of predicting future values from a given historical time series. Existing LLM-based approaches transfer knowledge learned from text data to time series prediction through prompting or fine-tuning strategies. However, LLMs are proficient at reasoning over discrete tokens and semantic patterns and were not originally designed to model continuous numerical time series data. This gap between text and time series data causes LLMs to underperform a vanilla Transformer trained directly on TSF data, while vanilla Transformers in turn often struggle to learn high-level semantic patterns. In this paper, we design a novel Transformer-based architecture that leverages LLMs and vanilla Transformers in a complementary manner, integrating the high-level semantic representations learned by LLMs into the temporal information encoded by time series Transformers: a hybrid representation is obtained by fusing the representations from the LLM and the Transformer. The resulting fused representation captures both historical temporal dynamics and semantic variation patterns, allowing our model to predict future values more accurately. Experiments on benchmark datasets demonstrate the effectiveness of the proposed approach.
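As a usage note, a single joint end-to-end training step for such a fused model might look like the sketch below. It assumes the hypothetical DualStreamForecaster class from the earlier example; the L1 (MAE) objective, optimizer, learning rate, and dummy data are illustrative assumptions rather than the paper's reported setup.

    # Illustrative joint end-to-end training step for the fused model sketched above;
    # it assumes the DualStreamForecaster class from that sketch, and the L1 (MAE)
    # objective, optimizer, and dummy data are assumptions, not the paper's setup.
    import torch
    import torch.nn as nn

    model = DualStreamForecaster(d_model=64, horizon=24)
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3)
    criterion = nn.L1Loss()             # MAE, the metric reported in the summary

    history = torch.randn(32, 96)       # dummy batch: 32 series, 96 past steps
    target = torch.randn(32, 24)        # 24 future steps to predict

    optimizer.zero_grad()
    prediction = model(history)         # both streams run and are fused in one forward pass
    loss = criterion(prediction, target)
    loss.backward()                     # gradients reach the prompts, temporal stream, fusion, and head
    optimizer.step()
    print(float(loss))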
Problem

Research questions and friction points this paper is trying to address.

Bridging the gap between LLMs and time series data
Enhancing temporal and semantic pattern integration
Improving time series forecasting accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses LLMs with temporal Transformers
Integrates semantic and temporal representations
Hybrid model improves forecasting accuracy
🔎 Similar Papers
No similar papers found.
Chen Su
PhD candidate, College of Optical Science and Engineering, Zhejiang University
3D display, HCI
Yuanhe Tian
University of Washington
Computational Linguistics, Natural Language Processing
Qinyu Liu
Beijing Northern Computility InterConnection Co., Ltd.
Jun Zhang
ENN Group Co., Ltd.
Yan Song
University of Science and Technology of China