SSDA: Bridging Spectral and Structural Gaps via Dual Adaptation for Vision-Based Time Series Forecasting

📅 2026-05-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited transferability of pretrained vision models to time series forecasting, which stems from spectral and structural discrepancies between rendered time series images and natural images, a mismatch the study terms the “spectral-structural gap.” To bridge it, the authors propose SSDA, a dual-branch network. At the data level, a Spectral Magnitude Aligner (SMA) applies a 2D FFT to shift the magnitude spectrum toward natural-image statistics while preserving phase. At the model level, a Structural-Guided Low-Rank Adaptation (SG-LoRA) module injects position-aware temporal encodings into patch embeddings and adapts attention through low-rank updates. The two branches are adaptively fused to produce the final forecast. Evaluated on seven real-world datasets, SSDA consistently outperforms strong large vision model (LVM) and large language model (LLM) baselines under both full-data and few-shot settings.
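To make the "low-rank adaptation" idea concrete, the following is a minimal NumPy sketch of a generic LoRA-style update on a single linear layer. It is not the paper's SG-LoRA: the structure guidance and positional encodings are omitted, and every name and size here (`lora_forward`, `d_in`, `rank`, `alpha`) is a hypothetical choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, alpha = 16, 16, 4, 8.0  # illustrative sizes, not from the paper

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-initialised


def lora_forward(x, W, A, B, alpha, rank):
    """Compute x @ (W + (alpha / rank) * B @ A).T without forming the full update."""
    base = x @ W.T                                # frozen path
    update = (alpha / rank) * (x @ A.T) @ B.T     # low-rank trainable path
    return base + update


x = rng.standard_normal((2, d_in))
y = lora_forward(x, W, A, B, alpha, rank)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the small matrices `A` and `B` would be trained.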

📝 Abstract

Large vision models (LVMs) have recently proven to be surprisingly effective time series forecasters, simply by rendering temporal data as images. This success, however, rests on a largely unexamined premise: the rendered time series images are sufficiently close to natural images for knowledge in pre-trained models to transfer effectively. We argue that two gaps still remain, i.e., spectral and structural gaps, fundamentally limiting the potential of LVMs for time series forecasting. Spectrally, we systematically reveal that rendered time series images exhibit a markedly shallower power spectrum than the natural images LVMs are pre-trained to recognize. Structurally, reshaping 1D temporal sequences into 2D grids fabricates spurious spatial adjacencies while severing genuine temporal continuities, misleading the spatial inductive biases of pre-trained LVMs. To bridge these gaps, we propose SSDA, a dual-branch network that spectrally and structurally adapts to unlock the full potential of LVMs for time series forecasting. At the data level, a Spectral Magnitude Aligner (SMA) applies 2D FFT to selectively enhance the magnitude spectrum toward natural-image statistics while preserving phase. At the model level, a Structural-Guided Low-Rank Adaptation (SG-LoRA) injects position-aware temporal encodings into patch embeddings and adapts attention via low-rank updates. The two branches are further adaptively fused to produce the final forecast. Extensive experiments on seven real-world benchmarks demonstrate that SSDA consistently outperforms strong LVM- and LLM-based baselines under both full-shot and few-shot settings. Code is publicly available at https://anonymous.4open.science/r/SSDA-8C5B.
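The phrase "enhance the magnitude spectrum while preserving phase" can be illustrated with a short NumPy sketch. This is an assumption-laden toy, not the paper's SMA: the function name `reweight_magnitude`, the radial power-law gain, and the `decay_boost` parameter are all hypothetical stand-ins for whatever alignment rule the authors actually use.

```python
import numpy as np


def reweight_magnitude(image, decay_boost=0.5):
    """Rescale the 2D FFT magnitude by radial frequency, leaving the phase untouched.

    decay_boost is a hypothetical knob: larger values steepen the spectral decay.
    """
    spectrum = np.fft.fft2(image)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Radial frequency of every FFT bin (cycles per pixel).
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)

    # Power-law gain; the DC bin keeps gain 1 so the image mean is unchanged.
    gain = np.ones_like(radius)
    nonzero = radius > 0
    gain[nonzero] = radius[nonzero] ** (-decay_boost)

    # Recombine the modified magnitude with the original phase.
    new_spectrum = magnitude * gain * np.exp(1j * phase)
    # The gain is symmetric in frequency, so the inverse is real up to rounding.
    return np.fft.ifft2(new_spectrum).real


rng = np.random.default_rng(0)
image = rng.random((8, 8))  # stand-in for a rendered time series image
aligned = reweight_magnitude(image, decay_boost=0.5)
```

Only the magnitude is modified, so the location of edges and patterns encoded in the phase carries over to the output; setting `decay_boost=0` returns the input unchanged.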

Problem

Research questions and friction points this paper is trying to address.

spectral gap

structural gap

vision-based time series forecasting

large vision models

temporal continuity

Innovation

Methods, ideas, or system contributions that make the work stand out.

spectral adaptation

structural adaptation

vision-based time series forecasting