🤖 AI Summary
This study addresses the poor calibration of prediction intervals in few-shot time series forecasting. We propose a calibration-preserving framework that integrates time series foundation models (TSFMs) with conformal prediction. Leveraging TSFMs’ strong zero-shot generalization and high point-forecast accuracy, our approach improves the reliability and calibration of prediction intervals under data-scarce conditions in two ways: (i) more accurate point forecasts directly yield higher-quality intervals; and (ii) reduced dependence on task-specific training data leaves more observations for calibration, which stabilizes the conformal calibration procedure. Experiments under low-data regimes show that our method outperforms classical statistical models (e.g., ETS, ARIMA) and gradient-boosting approaches (e.g., LightGBM) on key metrics, including empirical coverage probability, interval width, and calibration error. The implementation is publicly available.
📝 Abstract
The zero-shot capabilities of foundation models (FMs) for time series forecasting offer promising potential for conformal prediction, as most of the available data can be allocated to calibration. This study compares the performance of Time Series Foundation Models (TSFMs) with traditional methods, including statistical models and gradient boosting, within a conformal prediction setting. Our findings highlight two key advantages of TSFMs. First, when the volume of data is limited, TSFMs provide more reliable conformalized prediction intervals than classic models, thanks to their superior predictive accuracy. Second, the calibration process is more stable because more data are used for calibration. Moreover, the less data is available, the more pronounced these benefits become, as classic models require a substantial amount of data for effective training. These results underscore the potential of foundation models to improve the reliability of conformal prediction in time series applications, particularly in data-constrained cases. All the code to reproduce the experiments is available.
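To make the data-allocation argument concrete, the following is a minimal sketch of split conformal prediction with a zero-shot forecaster. It is not the authors' implementation. The function names (`conformal_quantile`, `naive_forecast`, `conformal_interval`) and the `min_context` parameter are hypothetical, and `naive_forecast` is only a last-value stand-in for a TSFM. The sketch also uses absolute residuals as nonconformity scores and treats them as exchangeable, which is an assumption that time series data may violate.

```python
import math

import numpy as np


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Split conformal uses the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return float(np.sort(scores)[k - 1])


def naive_forecast(history):
    """Hypothetical stand-in for a zero-shot TSFM: repeat the last value."""
    return float(history[-1])


def conformal_interval(series, alpha=0.1, min_context=8):
    """One-step-ahead interval where all residuals serve as calibration data."""
    series = np.asarray(series, dtype=float)
    # A zero-shot model needs no training split, so every one-step-ahead
    # residual after the first `min_context` points is a calibration score.
    scores = np.array([
        abs(series[t] - naive_forecast(series[:t]))
        for t in range(min_context, len(series))
    ])
    q = conformal_quantile(scores, alpha)
    point = naive_forecast(series)
    return point - q, point + q
```

A model that must be trained would first set aside part of the series for fitting, which leaves fewer scores for `conformal_quantile`. With very few scores the required rank exceeds `n` and the interval becomes unbounded, which illustrates why short series favor a forecaster that needs no training data.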