🤖 AI Summary
This work addresses the challenge of jointly modeling paired textual and numerical modalities in multivariate time series. We first observe—empirically and theoretically—that real-world time-stamped textual annotations exhibit intrinsic periodicity aligned with the underlying numerical time series. Leveraging this insight, we propose Texts as Time Series (TaTS), a novel framework that treats auxiliary text as an additional time series. TaTS introduces three key components: (i) periodicity-aware alignment to synchronize textual and numerical dynamics; (ii) time-series-aware textual embedding, converting token-level representations into temporally structured encodings; and (iii) a lightweight cross-modal coupling module enabling plug-and-play integration without modifying any core time-series model architecture. Evaluated across multiple benchmark datasets on forecasting and missing-value imputation tasks, TaTS consistently delivers significant performance gains. Crucially, it is model-agnostic—compatible with diverse state-of-the-art time-series models—thereby establishing a scalable, low-intrusion paradigm for multimodal time-series modeling.
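The core idea of treating paired texts as auxiliary variables can be illustrated with a minimal sketch. Everything below is hypothetical: the function name `texts_as_series`, the random projection standing in for a learned one, and the dummy embeddings are illustrative assumptions, not the paper's actual architecture. The sketch shows the plug-and-play property the summary describes: text embeddings are projected to a few extra channels and concatenated with the numerical series, so an unmodified numerical-only model can consume the result.

```python
import numpy as np

def texts_as_series(num_series, text_embeddings, k=4):
    """Hypothetical sketch: treat time-stamped texts as auxiliary series.

    num_series:       (T, C) numerical multivariate time series.
    text_embeddings:  (T, D) one embedding per time step (from any text
                      encoder; dummy values here).
    Projects embeddings down to k auxiliary channels and concatenates
    them with the numerical channels, so any numerical-only model can
    be applied to the augmented (T, C + k) series unchanged.
    """
    T, D = text_embeddings.shape
    rng = np.random.default_rng(0)
    # Stand-in for a learned projection (e.g. a trained linear layer).
    proj = rng.standard_normal((D, k)) / np.sqrt(D)
    aux = text_embeddings @ proj                      # (T, k) text-derived series
    return np.concatenate([num_series, aux], axis=1)  # (T, C + k)

# Toy usage: one periodic numerical channel plus dummy "text embeddings".
x = np.sin(np.linspace(0.0, 6.28, 48)).reshape(-1, 1)  # (48, 1)
e = np.tile(np.eye(8), (6, 1))                          # (48, 8)
augmented = texts_as_series(x, e, k=2)
print(augmented.shape)  # (48, 3)
```

The design point this illustrates is the low-intrusion claim: the downstream forecaster or imputer only ever sees a wider multivariate series, so no model architecture needs to change.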
📝 Abstract
While many advances in time series models focus exclusively on numerical data, research on multimodal time series, particularly those involving contextual textual information commonly encountered in real-world scenarios, remains in its infancy. Consequently, effectively integrating the text modality remains challenging. In this work, we highlight an intuitive yet significant observation that has been overlooked by existing works: time-series-paired texts exhibit periodic properties that closely mirror those of the original time series. Building on this insight, we propose a novel framework, Texts as Time Series (TaTS), which treats time-series-paired texts as auxiliary variables of the time series. TaTS can be plugged into any existing numerical-only time series model, enabling it to handle time series data with paired texts effectively. Through extensive experiments on both multimodal time series forecasting and imputation tasks across benchmark datasets with various existing time series models, we demonstrate that TaTS consistently enhances predictive performance and outperforms existing approaches without modifying model architectures.