🤖 AI Summary
Pre-trained time-series models suffer from catastrophic forgetting when continually adapted to new data distributions, especially when the original training data is inaccessible, for example under data-privacy constraints. To address this, we propose a frequency-aware continual adaptation framework: (1) a multi-band replay mechanism built on wavelet decomposition augments model-generated samples while preserving trend characteristics; (2) a latent-space consistency constraint with semantic alignment keeps knowledge retention compact and coherent; and (3) self-supervised contrastive learning is integrated to enhance generalization. Experiments show that the method reduces MAE and MSE on new tasks by up to 46.9% and 46.8%, respectively, while improving performance on old tasks by up to 5.7% (MAE) and 6.0% (MSE). Notably, in few-shot settings it outperforms state-of-the-art baselines even when synthetic proxy samples amount to only 5% of the new-task dataset.
📝 Abstract
Pre-trained models have demonstrated exceptional generalization capabilities in time-series forecasting; however, adapting them to evolving data distributions remains a significant challenge. A key hurdle is that the original training data is often inaccessible, and fine-tuning solely on new data tends to cause catastrophic forgetting. To address this issue, we propose Replay Tuning (R-Tuning), a novel framework for the continual adaptation of pre-trained time-series models. R-Tuning constructs a unified latent space that captures both prior and current task knowledge through a frequency-aware replay strategy. Specifically, it augments model-generated samples via wavelet-based decomposition across multiple frequency bands, generating trend-preserving and fusion-enhanced variants to improve representation diversity and replay efficiency. To further reduce reliance on synthetic samples, R-Tuning introduces a latent consistency constraint that aligns new representations with the prior task space. This constraint guides joint optimization within a compact and semantically coherent latent space, ensuring robust knowledge retention and adaptation. Extensive experiments demonstrate the superiority of R-Tuning, which reduces MAE and MSE by up to 46.9% and 46.8%, respectively, on new tasks, while preserving prior knowledge with gains of up to 5.7% and 6.0% on old tasks. Notably, under few-shot settings, R-Tuning outperforms all state-of-the-art baselines even when synthetic proxy samples account for only 5% of the new-task dataset.
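The idea of a trend-preserving, multi-band augmentation can be illustrated with a minimal sketch. The abstract does not specify the wavelet family, the number of decomposition levels, or how variants are generated, so everything below is an assumption for illustration: a Haar wavelet, three levels, and multiplicative Gaussian jitter on the detail (high-frequency) bands while the coarsest approximation (the trend) is left untouched. The function names `haar_decompose`, `haar_reconstruct`, and `trend_preserving_variant` are hypothetical and are not taken from R-Tuning.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_decompose(x, levels):
    """Multi-level Haar DWT.

    Returns (approximation, details), where details[0] is the finest
    (highest-frequency) band and details[-1] the coarsest.
    """
    if len(x) % (2 ** levels) != 0:
        raise ValueError("len(x) must be divisible by 2**levels")
    approx = [float(v) for v in x]
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append([(a - b) / SQRT2 for a, b in zip(even, odd)])
        approx = [(a + b) / SQRT2 for a, b in zip(even, odd)]
    return approx, details


def haar_reconstruct(approx, details):
    """Inverse of haar_decompose: rebuild the signal from coarsest to finest."""
    signal = list(approx)
    for band in reversed(details):
        out = []
        for s, d in zip(signal, band):
            out.append((s + d) / SQRT2)
            out.append((s - d) / SQRT2)
        signal = out
    return signal


def trend_preserving_variant(x, levels=3, noise_scale=0.1, rng=None):
    """Assumed augmentation: jitter detail bands, keep the approximation fixed."""
    rng = rng or random.Random()
    approx, details = haar_decompose(x, levels)
    jittered = [
        [d * (1.0 + noise_scale * rng.gauss(0.0, 1.0)) for d in band]
        for band in details
    ]
    return haar_reconstruct(approx, jittered)
```

Because the Haar transform is orthogonal, decomposing an augmented series again yields the same approximation coefficients as the original, which is one simple way to make "trend-preserving" concrete. A replay pipeline along these lines would apply such a function to model-generated samples before mixing them with new-task data; the fusion-enhanced variants and the latent consistency constraint mentioned in the abstract are not covered by this sketch.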