Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting

📅 2026-03-16
🤖 AI Summary
This work addresses the limitation of existing time series forecasting methods, which predominantly focus on numerical data and struggle to effectively incorporate accompanying textual information, thereby constraining their ability to model complex real-world scenarios. To overcome this, the authors propose a multimodal fusion framework that leverages event-driven reasoning and historical context learning to guide large language models in semantic inference. The framework introduces an endogenous text alignment mechanism and an adaptive frequency-domain fusion strategy to achieve deep integration of textual information at both representation and prediction levels. Extensive experiments across ten real-world datasets spanning diverse domains demonstrate that the proposed method significantly outperforms current state-of-the-art approaches, confirming that effective utilization of textual cues can substantially enhance time series forecasting performance.
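The historical context learning step mentioned above can be illustrated with a small retrieval sketch. The summary does not say how examples are retrieved or how the prompt is laid out, so everything here is an assumption made for illustration: the Euclidean similarity measure, the record fields (`window`, `text`, `outcome`), and the helper names `retrieve_examples` and `build_prompt` are hypothetical and are not taken from the VoT code.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_examples(query_window, history, k=2):
    """Return the k historical records whose window is closest to the query.

    Assumption: similarity is plain Euclidean distance on raw values.
    The paper may use a different retrieval criterion.
    """
    ranked = sorted(history, key=lambda ex: euclidean(query_window, ex["window"]))
    return ranked[:k]


def build_prompt(query_window, query_text, examples):
    """Format retrieved records as in-context examples ahead of the query."""
    lines = []
    for i, ex in enumerate(examples, start=1):
        lines.append(f"Example {i}")
        lines.append(f"History: {ex['window']}")
        lines.append(f"Event: {ex['text']}")
        lines.append(f"Outcome: {ex['outcome']}")
    lines.append("Query")
    lines.append(f"History: {query_window}")
    lines.append(f"Event: {query_text}")
    lines.append("Outcome:")  # left open for the language model to complete
    return "\n".join(lines)


# Toy records, invented purely for illustration.
history = [
    {"window": [1.0, 2.0, 3.0], "text": "supply cut announced", "outcome": "rise"},
    {"window": [9.0, 9.0, 9.0], "text": "no notable events", "outcome": "flat"},
    {"window": [1.0, 2.0, 4.0], "text": "demand surge reported", "outcome": "rise"},
]
examples = retrieve_examples([1.0, 2.0, 3.5], history, k=2)
prompt = build_prompt([1.0, 2.0, 3.5], "export ban extended", examples)
```

The resulting `prompt` string would then be passed to a language model; that call is omitted because the summary does not specify the model or its interface.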

📝 Abstract
Existing time series forecasting methods rely primarily on the numerical data itself. However, real-world time series exhibit complex patterns tied to multimodal information, which makes them difficult to predict from numerical data alone. Although several multimodal time series forecasting methods have emerged, they either use text that carries limited supplementary information or focus only on representation extraction, so little textual information actually reaches the forecast. To unlock the Value of Text, we propose VoT, a method with Event-driven Reasoning and Multi-level Alignment. Event-driven Reasoning combines the rich information in exogenous text with the powerful reasoning capabilities of LLMs for time series forecasting. To guide the LLMs toward effective reasoning, we propose Historical In-context Learning, which retrieves historical examples and applies them as in-context guidance. To maximize the utilization of text, we propose Multi-level Alignment. At the representation level, Endogenous Text Alignment integrates endogenous text information with the time series. At the prediction level, Adaptive Frequency Fusion fuses the frequency components of the event-driven prediction and the numerical prediction so that the two complement each other. Experiments on real-world datasets across 10 domains demonstrate significant improvements over existing methods, validating the effectiveness of our approach in utilizing text. The code is available at https://github.com/decisionintelligence/VoT.
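The prediction-level fusion described in the abstract can be pictured with a minimal sketch: transform two forecasts to the frequency domain, blend them frequency by frequency, and transform back. This is only an illustration under stated assumptions. The per-frequency weights are fixed by hand here, whereas the abstract calls the fusion adaptive, which suggests the weighting is learned or data-dependent; the function names (`dft`, `idft`, `fuse_in_frequency`) are hypothetical and not taken from the VoT repository. A naive DFT is used so the example needs only the standard library.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def fuse_in_frequency(event_pred, numeric_pred, half_weights):
    """Blend two forecasts per frequency component.

    half_weights[k] is the share given to the event-driven forecast at
    frequency k, for k = 0 .. n // 2. The weights are mirrored onto the
    conjugate frequencies so the fused signal stays real-valued.
    Assumption: weights are fixed constants, not learned.
    """
    n = len(event_pred)
    if len(numeric_pred) != n or len(half_weights) != n // 2 + 1:
        raise ValueError("length mismatch between forecasts and weights")
    weights = [half_weights[min(k, n - k)] for k in range(n)]
    event_spec = dft(event_pred)
    numeric_spec = dft(numeric_pred)
    fused = [w * e + (1 - w) * m for w, e, m in zip(weights, event_spec, numeric_spec)]
    return idft(fused)


# Toy forecasts, invented for illustration.
event_pred = [1.0, 2.0, 3.0, 4.0]    # e.g. a forecast from LLM reasoning
numeric_pred = [2.0, 2.0, 2.0, 2.0]  # e.g. a forecast from a numerical model

# Take the mean level (frequency 0) from the numerical forecast and the
# remaining frequency components from the event-driven forecast.
fused = fuse_in_frequency(event_pred, numeric_pred, [0.0, 1.0, 1.0])
```

In this toy case the fused forecast keeps the shape of the event-driven prediction while adopting the level of the numerical one, which is one way to read "complementary advantages" at the frequency level.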

Problem

Research questions and friction points this paper is trying to address.

time series forecasting
multimodal
text utilization
event-driven reasoning
multi-level alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Event-driven Reasoning
Multi-level Alignment
Historical In-context Learning
Adaptive Frequency Fusion
Multimodal Time Series Forecasting