🤖 AI Summary
Standard online conformal prediction (OCP) suffers from two key limitations in time-series forecasting: (1) reliance on simplistic nonconformity scores defined solely in the output space, and (2) uniform weighting of historical data, rendering it ill-suited for non-stationary processes and distribution shifts. To address these, we propose the first neural-feature-space extension of OCP, incorporating an attention mechanism that adaptively weights past samples based on both task relevance and evolving distribution dynamics. Our method integrates pretrained feature representations, feature-space nonconformity scoring, and dynamic quantile estimation, with a theoretical guarantee of long-term marginal coverage. Experiments across diverse synthetic and real-world time-series benchmarks demonstrate that our approach reduces average prediction interval width by up to 88% while maintaining the target coverage level. These results validate the effectiveness and robustness of feature-space calibration coupled with attention-based adaptive weighting.
📝 Abstract
Online conformal prediction (OCP) wraps around any pre-trained predictor to produce prediction sets with coverage guarantees that hold irrespective of temporal dependencies or distribution shifts. However, standard OCP faces two key limitations: it operates in the output space using simple nonconformity (NC) scores, and it treats all historical observations uniformly when estimating quantiles. This paper introduces attention-based feature OCP (AFOCP), which addresses both limitations through two key innovations. First, AFOCP operates in the feature space of pre-trained neural networks, leveraging learned representations to construct more compact prediction sets by concentrating on task-relevant information while suppressing nuisance variation. Second, AFOCP incorporates an attention mechanism that adaptively weights historical observations based on their relevance to the current test point, effectively handling non-stationarity and distribution shifts. We provide theoretical guarantees showing that AFOCP maintains long-term coverage while provably achieving smaller prediction intervals than standard OCP under mild regularity conditions. Extensive experiments on synthetic and real-world time series datasets demonstrate that AFOCP consistently reduces the size of prediction intervals by as much as $88\%$ compared to OCP, while maintaining target coverage levels, validating the benefits of both feature-space calibration and attention-based adaptive weighting.
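To make the two ingredients concrete, here is a minimal sketch of attention-weighted conformal quantile estimation. This is not the authors' exact AFOCP algorithm; the function names, the dot-product similarity, and the softmax temperature are all illustrative assumptions. The idea it demonstrates is the one the abstract describes: past nonconformity scores are weighted by the similarity of their feature representations to the current test point, and the prediction-interval radius is the weighted $(1-\alpha)$ quantile of those scores.

```python
import numpy as np

def attention_weights(test_feat, hist_feats, temperature=1.0):
    """Softmax attention over similarity between the test point's features
    and historical features (hypothetical dot-product form)."""
    scores = hist_feats @ test_feat / temperature
    scores = scores - scores.max()          # numerical stability
    w = np.exp(scores)
    return w / w.sum()

def weighted_conformal_quantile(nc_scores, weights, alpha=0.1):
    """Weighted (1 - alpha) empirical quantile of nonconformity scores.
    With uniform weights this reduces to the standard OCP quantile."""
    order = np.argsort(nc_scores)
    s, w = nc_scores[order], weights[order]
    cdf = np.cumsum(w)
    idx = np.searchsorted(cdf, 1.0 - alpha)  # first index covering mass 1-alpha
    return s[min(idx, len(s) - 1)]

# Illustrative usage: history of 200 feature/score pairs, one test point.
rng = np.random.default_rng(0)
hist_feats = rng.normal(size=(200, 8))
nc_scores = np.abs(rng.normal(size=200))     # e.g. |y - y_hat| in feature space
test_feat = rng.normal(size=8)

w = attention_weights(test_feat, hist_feats)
radius = weighted_conformal_quantile(nc_scores, w, alpha=0.1)
```

Relevance weighting shrinks intervals when the history contains regimes unlike the current one: irrelevant (dissimilar) past scores receive little mass, so the quantile reflects only comparable conditions.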