🤖 AI Summary
Traditional multivariate time series modeling often neglects the semantic information embedded in variable names and textual descriptions, resulting in limited model interpretability and poor generalization. To address this, we propose a semantics-enhanced dual-modality modeling framework. First, a large language model is employed to parse variable-level textual descriptions and construct a structured multivariate knowledge graph. Second, a dual-modality encoder is designed to jointly model semantic prompts and numerical time-series patterns via cross-modal attention, augmented with causal priors to support causal reasoning. This work represents the first systematic integration of fine-grained, variable-level semantic knowledge into the time series modeling pipeline. Extensive experiments on multiple benchmark datasets demonstrate significant improvements in both forecasting and classification performance, alongside enhanced model interpretability and cross-domain generalization capability.
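The pipeline above turns variable-level knowledge-graph triplets into textual prompts before encoding. As a minimal illustrative sketch (the triplet schema and prompt template here are hypothetical, not the paper's actual format), the conversion might look like:

```python
def triplets_to_prompt(variable, triplets):
    """Render knowledge-graph triplets about one variable into a text prompt.

    triplets: list of (head, relation, tail) tuples.
    The prompt wording is a hypothetical illustration of the idea, not
    the template used by TimeMKG itself.
    """
    facts = [f"{h} {r.replace('_', ' ')} {t}" for h, r, t in triplets if h == variable]
    return f"Variable '{variable}': " + "; ".join(facts) + "."

# Toy graph for an electricity-load dataset (made-up example)
kg = [
    ("temperature", "influences", "load"),
    ("temperature", "measured_in", "Celsius"),
    ("load", "depends_on", "hour_of_day"),
]
print(triplets_to_prompt("temperature", kg))
```

Each variable thus receives its own prompt summarizing its relations, which the semantic branch of the dual-modality encoder can then embed.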
📝 Abstract
Multivariate time series data typically comprise two distinct modalities: variable semantics and sampled numerical observations. Traditional time series models treat variables as anonymous statistical signals, overlooking the rich semantic information embedded in variable names and data descriptions. However, these textual descriptors often encode domain knowledge that is essential for robust and interpretable modeling. Here we present TimeMKG, a multimodal causal reasoning framework that elevates time series modeling from low-level signal processing to knowledge-informed inference. TimeMKG employs large language models to interpret variable semantics and constructs structured Multivariate Knowledge Graphs that capture inter-variable relationships. A dual-modality encoder separately models the semantic prompts, generated from knowledge-graph triplets, and the statistical patterns from historical time series. Cross-modality attention aligns and fuses these representations at the variable level, injecting explicit, interpretable causal priors into downstream tasks such as forecasting and classification to guide model reasoning. Experiments on diverse datasets demonstrate that incorporating variable-level knowledge significantly improves both predictive performance and generalization.
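The variable-level fusion step can be sketched as a single-head scaled dot-product cross-attention in which semantic embeddings act as queries and time-series embeddings as keys and values. This is a simplified, hypothetical rendering of the mechanism (no learned projections, single head, residual-style fusion), not TimeMKG's exact architecture:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def cross_modal_attention(sem, ts):
    """Fuse per-variable semantic and time-series embeddings.

    sem: [V][d] semantic vectors (queries), one per variable.
    ts:  [V][d] time-series vectors (keys and values).
    Returns [V][d] fused vectors. Single head, no projection
    matrices: a hypothetical simplification for illustration.
    """
    d = len(sem[0])
    fused = []
    for q in sem:  # one query per variable
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in ts]
        w = softmax(scores)
        # attended numeric context for this variable
        ctx = [sum(w[j] * ts[j][i] for j in range(len(ts))) for i in range(d)]
        # residual-style fusion of semantic prior and numeric context
        fused.append([qi + ci for qi, ci in zip(q, ctx)])
    return fused
```

The attention weights make the semantic-to-numeric alignment explicit per variable, which is what allows the causal priors to remain inspectable downstream.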