🤖 AI Summary
To address the challenges of modeling long-range temporal dependencies and fusing multi-source information in time series, this paper proposes a novel time-series-to-vision self-transformation paradigm: raw sequences are mapped into multi-view image representations (e.g., Gramian Angular Field, Markov Transition Field), and discriminative visual features are extracted using a pre-trained Vision Transformer (ViT). Subsequently, a cross-modal conditional Latent Diffusion Model (LDM) is designed to jointly model visual priors and sequential dynamics—without requiring external image inputs. Key innovations include: (i) the first purely time-series-driven visual representation generation mechanism; (ii) a cross-modal conditional LDM architecture; and (iii) a feature-level fusion module. Extensive experiments on multiple benchmark datasets demonstrate that our method consistently outperforms state-of-the-art models—including Informer, Autoformer, and PatchTST—with an average 12.7% reduction in MAE. Notably, it exhibits superior generalization and robustness in long-horizon forecasting (h ≥ 96).
📝 Abstract
Diffusion models have recently emerged as powerful frameworks for generating high-quality images. While recent studies have explored their application to time series forecasting, these approaches face significant challenges in cross-modal modeling and in effectively transforming visual information to capture temporal patterns. In this paper, we propose LDM4TS, a novel framework that leverages the powerful image reconstruction capabilities of latent diffusion models for vision-enhanced time series forecasting. Instead of introducing external visual data, we are the first to use complementary transformation techniques to convert time series into multi-view visual representations, allowing the model to exploit the rich feature extraction capabilities of a pre-trained vision encoder. These representations are then reconstructed by a latent diffusion model equipped with a cross-modal conditioning mechanism and a fusion module. Experimental results demonstrate that LDM4TS outperforms a range of specialized models on time series forecasting tasks.
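The time-series-to-image transformations named above (Gramian Angular Field and Markov Transition Field) follow standard definitions and can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the paper's implementation; the function names and the bin count are assumptions.

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian Angular (Summation) Field: encode a 1-D series as a 2-D image."""
    x = np.asarray(x, dtype=float)
    # Rescale to [-1, 1] so arccos is defined.
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x_scaled, -1.0, 1.0))  # polar-angle encoding
    # GASF[i, j] = cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

def markov_transition_field(x, n_bins=8):
    """Markov Transition Field: pairwise transition probabilities between quantile bins."""
    x = np.asarray(x, dtype=float)
    # Assign each point to one of n_bins quantile bins (labels 0..n_bins-1).
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)
    # First-order Markov transition matrix over the bins.
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(q[:-1], q[1:]):
        W[a, b] += 1
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1)
    # MTF[i, j] = probability of transitioning from bin(x_i) to bin(x_j).
    return W[q[:, None], q[None, :]]

series = np.sin(np.linspace(0, 4 * np.pi, 64))
gaf = gramian_angular_field(series)   # (64, 64) image
mtf = markov_transition_field(series)  # (64, 64) image
```

Stacking such views as image channels is one plausible way to obtain the "multi-view visual representations" a pre-trained vision encoder can consume; the exact fusion used by LDM4TS is not specified here.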