🤖 AI Summary
Fine-tuning temporal foundation models is hampered by the heterogeneity of time series data—multiple sampling frequencies, varying channel counts, and variable-length historical/prediction sequences—which particularly limits long-horizon forecasting performance.
Method: We propose a parameter-efficient fine-tuning framework comprising: (1) a gated dynamic importance mechanism that selects LoRA modules without bias by preserving conditional parameter consistency before and after masking; (2) a lightweight reconstruction-based prediction head that decouples feature extraction from task-specific mapping, drastically reducing parameter count; and (3) multi-task joint training across long-/short-term forecasting and anomaly detection to enhance generalization.
Results: Evaluated on multiple benchmarks, our method reduces trainable parameters by over 90% compared to full fine-tuning and linear probing, while maintaining or improving long-horizon forecasting accuracy. This demonstrates superior parameter efficiency, robustness, and scalability for diverse temporal modeling tasks.
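To make the parameter-savings claim concrete, the sketch below counts parameters for a flattened linear probing head versus a shared per-patch reconstruction-style head. The head design and all sizes here are illustrative assumptions, not the paper's exact architecture or numbers:

```python
# Illustrative parameter counting (hypothetical sizes, not the paper's).
d_model, num_patches, horizon, patch_len = 512, 64, 720, 16

# Linear probing head: flatten all patch embeddings, then one dense map
# to the full forecast horizon (weights + biases).
linear_probe_params = num_patches * d_model * horizon + horizon

# Reconstruction-style head (assumed design): one shared per-patch decoder
# that maps each patch embedding back to patch_len values, reused across
# all patches, so its size is independent of num_patches and horizon.
recon_head_params = d_model * patch_len + patch_len

reduction = 1 - recon_head_params / linear_probe_params
print(f"linear probe: {linear_probe_params:,} params")
print(f"recon head:   {recon_head_params:,} params")
print(f"reduction:    {reduction:.1%}")
```

Because the reconstruction head is shared across patches, its cost stays flat as the horizon grows, which is where the bulk of the reduction comes from under these assumptions.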
📝 Abstract
We propose TRACE (Time Series Parameter Efficient Fine-tuning), an efficient fine-tuning method for time series foundation models. While pretrained time series foundation models are gaining popularity, they face two challenges: (1) Unlike natural language tasks, time series data vary in sampling frequency, channel count, and historical/prediction lengths; for long-term forecasting in particular, tailored fine-tuning can significantly enhance performance. (2) Existing parameter-efficient tuning methods such as LoRA remain applicable but must be adapted to these temporal characteristics. To address these challenges, TRACE introduces two key innovations: (1) Gated DSIC (Gated Dynamic Simulation Importance Calculation), an unbiased LoRA-module importance selection mechanism that ensures conditional parameter consistency before and after masking; experiments demonstrate that Gated DSIC outperforms common fine-tuning baselines. (2) Reconstructed prediction heads for long-term forecasting, which match or exceed the performance of linear probing heads while drastically reducing parameter counts. Extensive experiments on long-/short-term forecasting and anomaly detection tasks across diverse datasets, together with ablation studies, validate the effectiveness of our method.
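The masking-based selection idea can be sketched as follows. This is a minimal toy illustration, not the paper's Gated DSIC algorithm: it assumes a frozen weight with several gated LoRA modules, scores each module by the loss increase when its gate alone is masked to zero (all other parameters held fixed, so the masked and unmasked models remain conditionally consistent), and keeps the top-k modules. All names, shapes, and the importance score are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a frozen pretrained weight W plus several LoRA modules,
# each with a binary gate g_i. All sizes are illustrative.
d = 8
W = rng.normal(size=(d, d))                      # frozen pretrained weight
loras = [(rng.normal(scale=0.1, size=(d, 2)),    # A_i (down-projection)
          rng.normal(scale=0.1, size=(2, d)))    # B_i (up-projection)
         for _ in range(4)]
gates = np.ones(len(loras))                      # all modules active

def forward(x, gates):
    """Output with gated LoRA deltas: x @ (W + sum_i g_i * A_i B_i)^T."""
    W_eff = W + sum(g * A @ B for g, (A, B) in zip(gates, loras))
    return x @ W_eff.T

def loss(x, y, gates):
    return float(np.mean((forward(x, gates) - y) ** 2))

# Importance of module i = loss increase when only its gate is masked,
# with every other parameter unchanged (conditional consistency).
x = rng.normal(size=(16, d))
y = rng.normal(size=(16, d))
base = loss(x, y, gates)
importance = []
for i in range(len(loras)):
    masked = gates.copy()
    masked[i] = 0.0
    importance.append(loss(x, y, masked) - base)

# Keep only the top-k most important LoRA modules; prune the rest.
k = 2
keep = np.argsort(importance)[-k:]
gates = np.where(np.isin(np.arange(len(loras)), keep), 1.0, 0.0)
print("kept modules:", sorted(keep.tolist()))
```

In this toy form the score is a one-shot loss delta; the paper's mechanism additionally makes the selection unbiased via its gated dynamic simulation, which this sketch does not reproduce.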