🤖 AI Summary
Uncertainty quantification in time series forecasting is challenging because temporal dependence violates the exchangeability assumption on which conformal prediction relies. Method: This paper studies the theoretical guarantees of split conformal prediction for time series data, including predictors with memory such as autoregressive models, and introduces a "switch coefficient" that measures how far temporal dependence moves the data away from exchangeability. Contribution/Results: The loss of coverage relative to the nominal 1−α level is bounded in terms of the switch coefficient, and this characterization of the coverage probability is sharp over the class of stationary β-mixing processes. The results help explain the strong empirical performance of conformal prediction on time series and extend its theoretical footing beyond the i.i.d. and exchangeable settings; the tools developed along the way may also be useful for analyzing other predictive inference methods for dependent data.
📝 Abstract
We consider the problem of uncertainty quantification for prediction in a time series: if we use past data to forecast the next time point, can we provide valid prediction intervals around our forecasts? To avoid placing distributional assumptions on the data, in recent years the conformal prediction method has been a popular approach for predictive inference, since it provides distribution-free coverage for any iid or exchangeable data distribution. However, in the time series setting, the strong empirical performance of conformal prediction methods is not well understood, since even short-range temporal dependence is a strong violation of the exchangeability assumption. Using predictors with "memory" -- i.e., predictors that utilize past observations, such as autoregressive models -- further exacerbates this problem. In this work, we examine the theoretical properties of split conformal prediction in the time series setting, including the case where predictors may have memory. Our results bound the loss of coverage of these methods in terms of a new "switch coefficient", measuring the extent to which temporal dependence within the time series creates violations of exchangeability. Our characterization of the coverage probability is sharp over the class of stationary, $β$-mixing processes. Along the way, we introduce tools that may prove useful in analyzing other predictive inference methods for dependent data.
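To make the setting concrete, the following is a minimal sketch of split conformal prediction with a one-step autoregressive predictor, i.e. a predictor with "memory" in the sense used above. It is an illustration under stated assumptions, not the paper's analysis: the simulated AR(1) series, the train/calibration/test split sizes, and the helper names `simulate_ar1`, `fit_ar1`, and `split_conformal_halfwidth` are all hypothetical choices made here, and the sketch does not compute the switch coefficient.

```python
import numpy as np

def simulate_ar1(n, phi=0.5, seed=0):
    """Simulate a stationary-like AR(1) series y[t] = phi * y[t-1] + noise (assumed toy data)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + eps[t]
    return y

def fit_ar1(y):
    """Least-squares estimate of the AR(1) coefficient (no intercept)."""
    prev, curr = y[:-1], y[1:]
    return float(prev @ curr / (prev @ prev))

def split_conformal_halfwidth(scores, alpha):
    """Standard split conformal quantile: the ceil((1 - alpha)(n + 1))-th smallest score."""
    n = len(scores)
    k = int(np.ceil((1 - alpha) * (n + 1)))
    if k > n:
        return float("inf")  # too few calibration scores for this alpha
    return float(np.sort(scores)[k - 1])

alpha = 0.1
y = simulate_ar1(3000, phi=0.5, seed=0)
train, calib, test = y[:1000], y[1000:2000], y[2000:]

# Fit the predictor on the first segment only.
phi_hat = fit_ar1(train)

# Calibration scores: absolute one-step-ahead residuals on the second segment.
cal_scores = np.abs(calib[1:] - phi_hat * calib[:-1])
q = split_conformal_halfwidth(cal_scores, alpha)

# Prediction intervals on the last segment: [prediction - q, prediction + q].
test_pred = phi_hat * test[:-1]
coverage = float(np.mean(np.abs(test[1:] - test_pred) <= q))

print(f"phi_hat={phi_hat:.3f}, half-width={q:.3f}, empirical coverage={coverage:.3f}")
```

Because the calibration residuals are temporally dependent, the usual exchangeability argument does not apply to this procedure directly; the empirical coverage printed here is only a sanity check on one simulated series, while the bounds described in the abstract are what quantify how far coverage can fall below 1−α.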