From Text to Time? Rethinking the Effectiveness of the Large Language Model for Time Series Forecasting

📅 2025-04-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies a critical evaluation bias in current LLM-based time-series forecasting: under small-sample fine-tuning, models overfit severely to specific datasets, which obscures the backbone's genuine temporal modeling capability. To address this, the authors construct three LLM backbones with identical architectures but distinct pretraining objectives (autoregressive modeling, masked language modeling, and contrastive learning) and decouple the encoder and decoder during large-scale pretraining. This enables what the authors present as the first controlled zero-shot and few-shot evaluation paradigm for LLM-based time-series forecasting. Experiments systematically demonstrate that LLMs possess fundamental temporal awareness yet achieve only limited forecasting accuracy. The study also releases open-source code and a standardized benchmark, providing both a reproducible foundation and a methodological caution for future research on LLM-based time-series forecasting.

📝 Abstract
Using pre-trained large language models (LLMs) as the backbone for time series forecasting has recently gained significant research interest. However, the effectiveness of LLM backbones in this domain remains a topic of debate. Through thorough empirical analysis, we observe that training and testing LLM-based models on small datasets often causes the encoder and decoder to become overly adapted to the dataset, thereby obscuring the true predictive capability of the LLM backbone. To investigate the genuine potential of LLMs in time series forecasting, we introduce three pre-trained models with identical architectures but different pre-training strategies. Large-scale pre-training then allows us to create unbiased encoder and decoder components tailored to the LLM backbone. Through controlled experiments, we evaluate the zero-shot and few-shot prediction performance of the LLM, offering insights into its capabilities. Extensive experiments reveal that although the LLM backbone demonstrates some promise, its forecasting performance is limited. Our source code is publicly available at the anonymous repository: https://anonymous.4open.science/r/LLM4TS-0B5C.
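The evaluation protocol described above can be sketched in a toy form: a frozen "backbone" transform sits between a trainable encoder and decoder, and only the encoder/decoder may adapt in the few-shot setting while zero-shot uses them untouched. Everything here is a hypothetical stand-in (linear layers and a tanh map instead of an actual LLM and its pretraining objectives); it is a minimal sketch of the decoupled setup, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: input patch length, hidden size, forecast horizon.
D_IN, D_HID, H = 16, 32, 4

# "Pretrained" backbone stand-in: its weights are frozen and never updated,
# mirroring how the LLM backbone stays fixed across experiments.
W_backbone = rng.standard_normal((D_HID, D_HID)) / np.sqrt(D_HID)

def backbone(z):
    return np.tanh(z @ W_backbone)  # frozen in both zero-shot and few-shot

# Encoder / decoder: the only components allowed to adapt (few-shot only).
W_enc = rng.standard_normal((D_IN, D_HID)) / np.sqrt(D_IN)
W_dec = rng.standard_normal((D_HID, H)) / np.sqrt(D_HID)

def forecast(x, W_enc, W_dec):
    return backbone(x @ W_enc) @ W_dec

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Synthetic sine-wave task: predict the next H points from the last D_IN.
def make_batch(n=64):
    t0 = rng.uniform(0, 2 * np.pi, size=(n, 1))
    t = t0 + np.arange(D_IN + H) * 0.3
    series = np.sin(t)
    return series[:, :D_IN], series[:, D_IN:]

x, y = make_batch()

# Zero-shot: evaluate with no adaptation at all.
zero_shot = mse(forecast(x, W_enc, W_dec), y)

# Few-shot: a handful of gradient steps on the decoder only;
# the backbone stays frozen throughout.
lr = 0.05
for _ in range(50):
    h = backbone(x @ W_enc)
    err = h @ W_dec - y
    W_dec -= lr * h.T @ err / len(x)  # MSE gradient w.r.t. W_dec

few_shot = mse(forecast(x, W_enc, W_dec), y)
print(f"zero-shot MSE: {zero_shot:.3f}  few-shot MSE: {few_shot:.3f}")
```

The point of the decoupling is visible even in this toy: few-shot adaptation of the small encoder/decoder alone can absorb much of the error, which is exactly the confound the paper argues inflates apparent LLM forecasting skill on small datasets.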
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM effectiveness in time series forecasting
Assessing dataset bias in LLM-based prediction models
Exploring LLM potential via pre-training strategy comparisons
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilize pre-trained LLMs for time series forecasting
Introduce three models with varied pre-training strategies
Evaluate zero-shot and few-shot prediction performance