🤖 AI Summary
This work addresses the performance degradation of adaptive optimizers such as Adam in the presence of distribution shifts within non-stationary time series. To mitigate this issue, the authors propose TS_Adam, a lightweight variant of Adam that enhances responsiveness to dynamic loss landscapes by removing the second-moment bias correction term and simplifying the learning rate computation—without introducing any additional hyperparameters. TS_Adam integrates seamlessly into mainstream time series models, such as MICN, enabling end-to-end training. Experimental results on the ETT benchmark demonstrate significant improvements in forecasting accuracy: compared to Adam, TS_Adam reduces average MSE by 12.8% and MAE by 5.7%, exhibiting superior adaptability and robustness across both short- and long-horizon prediction tasks.
📝 Abstract
Time-series forecasting often faces challenges from non-stationarity, particularly distributional drift, where the data distribution evolves over time. This dynamic behavior can undermine the effectiveness of adaptive optimizers, such as Adam, which are typically designed for stationary objectives. In this paper, we revisit Adam in the context of non-stationary forecasting and identify that its second-moment bias correction limits responsiveness to shifting loss landscapes. To address this, we propose TS_Adam, a lightweight variant that removes the second-moment bias correction from the learning rate computation. This simple modification improves adaptability to distributional drift while preserving the optimizer's core structure and requiring no additional hyperparameters. TS_Adam integrates easily into existing models and consistently improves performance across long- and short-term forecasting tasks. On the ETT datasets with the MICN model, it achieves an average reduction of 12.8% in MSE and 5.7% in MAE compared to Adam. These results underscore the practicality and versatility of TS_Adam as an effective optimization strategy for real-world forecasting scenarios involving non-stationary data. Code is available at: https://github.com/DD-459-1/TS_Adam.
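The modification described above can be illustrated with a minimal scalar sketch. It is one reading of the abstract, not the authors' released implementation: it assumes the first-moment bias correction is kept and that the raw second-moment estimate is used in the denominator. The function names `adam_step` and `run` and the flag `correct_second_moment` are hypothetical, and the default hyperparameters are the commonly used Adam defaults, not values taken from the paper.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, correct_second_moment=True):
    """One scalar Adam-style update. Returns (theta, m, v).

    correct_second_moment=True  -> standard Adam.
    correct_second_moment=False -> assumed TS_Adam-style variant: the
    second-moment estimate is used without bias correction.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    # First-moment bias correction is kept in both variants (assumption).
    m_hat = m / (1 - beta1 ** t)

    # Standard Adam rescales v by 1 / (1 - beta2^t); the variant skips this.
    v_used = v / (1 - beta2 ** t) if correct_second_moment else v

    theta = theta - lr * m_hat / (math.sqrt(v_used) + eps)
    return theta, m, v


def run(correct_second_moment, steps=5, grad=1.0):
    """Apply `steps` updates with a constant gradient, starting from zero."""
    theta, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        theta, m, v = adam_step(theta, grad, m, v, t,
                                correct_second_moment=correct_second_moment)
    return theta


adam_theta = run(True)      # standard Adam
variant_theta = run(False)  # variant without second-moment bias correction
```

Under these assumptions, the uncorrected second-moment estimate is smaller than the corrected one during the early steps, so the variant takes larger steps there; the two rules converge as `beta2 ** t` decays toward zero. Whether this matches the mechanism the authors credit for the improved responsiveness should be checked against the linked repository.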