🤖 AI Summary
Existing probabilistic time series forecasting methods suffer from two key limitations: weak distributional modeling and a mismatch between training objectives and evaluation metrics, particularly the Continuous Ranked Probability Score (CRPS). To address these, this paper proposes RDIT, a Residual-Diffusion-based Interval Transformer framework. First, the authors show theoretically that, for a strong point predictor augmented with zero-mean Gaussian residuals, CRPS can be minimized by setting the residual standard deviation to an optimal value. Building on this insight, RDIT employs a plug-and-play residual-conditional diffusion model in which a bidirectional Mamba network captures residual distribution dynamics. The paper further introduces CRPS-guided optimization and an implicit diffusion sampling algorithm for efficient inference. Experiments on eight multivariate benchmarks across varied forecasting horizons show that RDIT achieves lower CRPS, faster inference, and improved prediction interval coverage compared to strong baselines.
📝 Abstract
Probabilistic Time Series Forecasting (PTSF) plays a critical role in domains requiring accurate and uncertainty-aware predictions for decision-making. However, existing methods offer suboptimal distribution modeling and suffer from a mismatch between training and evaluation metrics. Surprisingly, we find that augmenting a strong point estimator with a zero-mean Gaussian whose standard deviation matches the estimator's training error can yield state-of-the-art performance in PTSF. In this work, we propose RDIT, a plug-and-play framework that combines point estimation and residual-based conditional diffusion with a bidirectional Mamba network. We theoretically prove that the Continuous Ranked Probability Score (CRPS) can be minimized by setting the standard deviation to its optimal value, and then derive algorithms to achieve distribution matching. Evaluations on eight multivariate datasets across varied forecasting horizons demonstrate that RDIT achieves lower CRPS, rapid inference, and improved coverage compared to strong baselines.
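The Gaussian baseline described in the abstract can be illustrated with a small sketch. This is not the RDIT method or the authors' code: it uses the standard closed-form CRPS of a Gaussian forecast, synthetic data, and hypothetical names (`gaussian_crps`, `point_forecast`, `y_true`). Under those assumptions, it scans candidate standard deviations around a point forecast and compares the CRPS-minimizing value with the root-mean-square residual.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise erf without requiring SciPy


def gaussian_crps(y, mu, sigma):
    """Mean closed-form CRPS of N(mu, sigma^2) forecasts against observations y."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    z = (y - mu) / sigma
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    crps = sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
    return float(np.mean(crps))


# Synthetic stand-in for a point estimator (assumption: residuals are
# zero-mean Gaussian with standard deviation 0.5).
rng = np.random.default_rng(0)
point_forecast = rng.normal(size=5000)
y_true = point_forecast + rng.normal(scale=0.5, size=5000)

# Scan candidate standard deviations and keep the one with the lowest CRPS.
sigmas = np.linspace(0.05, 2.0, 40)
scores = [gaussian_crps(y_true, point_forecast, s) for s in sigmas]
best_sigma = float(sigmas[int(np.argmin(scores))])

# Root-mean-square residual of the point forecast, for comparison.
residual_rmse = float(np.sqrt(np.mean((y_true - point_forecast) ** 2)))

print(f"CRPS-minimizing sigma on the grid: {best_sigma:.2f}")
print(f"RMS residual of the point forecast: {residual_rmse:.2f}")
```

In this toy setting the residuals really are Gaussian, so the CRPS-minimizing standard deviation lands close to the residual error. On real data the residual distribution may deviate from Gaussian, which is the gap a residual diffusion model is meant to address.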