🤖 AI Summary
Existing time-series analysis models are restricted to numeric modalities and struggle to incorporate domain-specific textual knowledge, a limitation compounded by the absence of high-quality multimodal benchmarks. To address this, we introduce Time-MMD—the first multi-domain, multimodal time-series dataset, spanning nine primary data domains—featuring fine-grained alignment between numeric sequences and domain-specific textual descriptions while eliminating data contamination. We also open-source MM-TSFlib, the first multimodal time-series forecasting (TSF) library, which pipelines joint modeling and fine-grained evaluation on Time-MMD. Experiments demonstrate that extending unimodal TSF to multimodality reduces mean squared error by over 15% on average, and by up to 40% in domains with rich textual data. This work advances time-series analysis beyond unimodal paradigms toward multimodal reasoning. Code and data are publicly available.
📝 Abstract
Time series data are ubiquitous across a wide range of real-world domains. While real-world time series analysis (TSA) requires human experts to integrate numerical series with multimodal domain-specific knowledge, most existing TSA models rely solely on numerical data, overlooking information beyond the numerical series. This oversight stems from the untapped potential of textual series data and the absence of a comprehensive, high-quality multimodal dataset. To overcome this obstacle, we introduce Time-MMD, the first multi-domain, multimodal time series dataset, covering 9 primary data domains. Time-MMD ensures fine-grained modality alignment, eliminates data contamination, and provides high usability. Additionally, we develop MM-TSFlib, the first-cut multimodal time-series forecasting (TSF) library, which seamlessly pipelines multimodal TSF evaluations on Time-MMD for in-depth analyses. Extensive experiments conducted on Time-MMD through MM-TSFlib demonstrate significant performance gains from extending unimodal TSF to multimodality: over 15% mean squared error reduction on average, and up to 40% in domains with rich textual data. More importantly, our dataset and library open up broader applications, impacts, and research topics to advance TSA. The dataset is available at https://github.com/AdityaLab/Time-MMD.