Dual-Forecaster: A Multimodal Time Series Model Integrating Descriptive and Predictive Texts

📅 2025-05-02
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing unimodal time series models rely solely on numerical data and therefore suffer from semantic sparsity. Multimodal approaches incorporate textual information, but they typically use text from only one direction, either historical or future, and lack fine-grained modeling of text and time series semantics, temporal dynamics, and causal relationships. To address these limitations, we propose Dual-Forecaster, a bidirectional text-driven forecasting paradigm that jointly integrates descriptive historical text and predictive future text for the first time. We design a three-stage cross-modal alignment module covering semantic, temporal, and causal alignment, using a large language model for text encoding and a dedicated time series feature extractor. Experiments on fifteen multimodal time series datasets show that the method outperforms or is comparable to state-of-the-art approaches, supporting the value of bidirectional textual integration.

📝 Abstract
Most existing single-modal time series models rely solely on numerical series and are therefore limited by insufficient information. Recent studies have shown that multimodal models can address this core issue by integrating textual information. However, these models focus on either historical or future textual information, overlooking the distinct contribution each makes to time series forecasting. In addition, these models fail to grasp the intricate relationships between textual and time series data, constrained by their moderate capacity for multimodal comprehension. To tackle these challenges, we propose Dual-Forecaster, a pioneering multimodal time series model that combines descriptive historical textual information with predictive textual insights, leveraging advanced multimodal comprehension capability empowered by three well-designed cross-modality alignment techniques. Our comprehensive evaluations on fifteen multimodal time series datasets demonstrate that Dual-Forecaster is a distinctly effective multimodal time series model that outperforms or is comparable to other state-of-the-art models, highlighting the superiority of integrating textual information for time series forecasting. This work opens new avenues in the integration of textual information with numerical time series data for multimodal time series analysis.

Problem

Research questions and friction points this paper is trying to address.

Existing single-modal models rely on numerical data alone, which carries insufficient information
Current multimodal models use either historical or future text, overlooking the distinct contribution of each
Models struggle with complex relationships between text and time series because of their limited multimodal comprehension

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines descriptive historical text with predictive future text
Uses three cross-modality alignment techniques to relate text and time series
Strengthens multimodal comprehension of the relationships between textual and time series data
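
The abstract does not spell out how the three alignment techniques work, so the following is only a generic illustration of cross-modality alignment, not the paper's method. It sketches scaled dot-product cross-attention, one common way to let time series representations attend to text representations. All names (`cross_attention`, `series_patches`, `history_text`, `future_text`), the embedding size, and the random embeddings are assumptions made for the example; in a real model the text embeddings would come from a language model and the patch embeddings from a time series encoder.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention on plain Python lists.

    queries: n_q vectors (here: hypothetical time series patch embeddings)
    keys, values: n_kv vectors (here: hypothetical text token embeddings)
    Returns the fused n_q vectors and the n_q x n_kv attention weights.
    """
    dim = len(queries[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        fused.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return fused, weights


def random_embeddings(rng, count, dim):
    """Stand-in embeddings; real ones would come from trained encoders."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
D_MODEL = 8  # assumed embedding size, chosen only for the example
series_patches = random_embeddings(rng, 6, D_MODEL)  # 6 time series patches
history_text = random_embeddings(rng, 4, D_MODEL)    # descriptive (historical) text
future_text = random_embeddings(rng, 3, D_MODEL)     # predictive (future) text

# Let every patch attend over both text sources at once.
text_tokens = history_text + future_text
fused, attn = cross_attention(series_patches, text_tokens, text_tokens)
```

Each row of `attn` sums to 1 and shows how strongly one patch draws on each historical or future text token. Concatenating the two text sources is just one simple design choice for the illustration; they could equally be attended to separately.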