🤖 AI Summary
Existing risk-controlling prediction sets (RCPS) control only the expected loss, so they do not capture the tail behavior that matters in safety-critical applications such as medical image segmentation. This work proposes OCE-RCPS, a framework that builds optimized certainty equivalent (OCE) risk measures, such as conditional value-at-risk (CVaR) and entropic risk, into prediction set construction. Using upper confidence bounds on the OCE risk, OCE-RCPS selects prediction set parameters that keep the risk below a user-specified tolerance with high probability. The theoretical analysis covers loss functions such as miscoverage and false negative rate, and image segmentation experiments show that OCE-RCPS consistently meets the target satisfaction rates across risk measures and reliability configurations, whereas OCE-CRC does not provide probabilistic guarantees.
📝 Abstract
In safety-critical applications such as medical image segmentation, prediction systems must provide reliability guarantees that extend beyond conventional expected loss control. While risk-controlling prediction sets (RCPS) offer probabilistic guarantees on the expected risk, they fail to capture tail behavior and worst-case scenarios that are crucial in high-stakes settings. This paper introduces optimized certainty equivalent RCPS (OCE-RCPS), a novel framework that provides high-probability guarantees on general optimized certainty equivalent (OCE) risk measures, including conditional value-at-risk (CVaR) and entropic risk. OCE-RCPS leverages upper confidence bounds to identify prediction set parameters that satisfy user-specified risk tolerance levels with provable reliability. We establish theoretical guarantees showing that OCE-RCPS satisfies the desired probabilistic constraint for loss functions such as miscoverage and false negative rate. Experiments on image segmentation demonstrate that OCE-RCPS consistently meets target satisfaction rates across various risk measures and reliability configurations, while OCE-CRC fails to provide probabilistic guarantees.
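To make the two risk measures named above concrete, here is a minimal sketch of their empirical versions on a sample of losses. It is not the paper's implementation: the function names and the parameter names `beta` (CVaR level) and `gamma` (entropic risk parameter) are assumptions chosen for illustration, and the sketch covers only the risk measures, not the upper confidence bound or the parameter selection step. CVaR is computed through its standard OCE form, the minimum over t of t + E[max(L - t, 0)] / (1 - beta), and entropic risk as (1/gamma) log E[exp(gamma L)].

```python
import math


def empirical_cvar(losses, beta):
    """Empirical CVaR at level beta (hypothetical helper, not from the paper).

    Uses the OCE form: min over t of t + mean(max(loss - t, 0)) / (1 - beta).
    The objective is convex and piecewise linear in t, with breakpoints at the
    sample values, so it is enough to evaluate it at each distinct sample.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if len(losses) == 0:
        raise ValueError("losses must be non-empty")
    n = len(losses)
    best = math.inf
    for t in set(losses):
        mean_excess = sum(max(x - t, 0.0) for x in losses) / n
        best = min(best, t + mean_excess / (1.0 - beta))
    return best


def empirical_entropic_risk(losses, gamma):
    """Empirical entropic risk (1/gamma) * log(mean(exp(gamma * loss))).

    Hypothetical helper; uses a log-sum-exp shift for numerical stability.
    """
    if gamma <= 0.0:
        raise ValueError("gamma must be positive")
    if len(losses) == 0:
        raise ValueError("losses must be non-empty")
    scaled = [gamma * x for x in losses]
    shift = max(scaled)
    mean_exp = sum(math.exp(z - shift) for z in scaled) / len(scaled)
    return (shift + math.log(mean_exp)) / gamma


if __name__ == "__main__":
    # Toy per-image losses (made up for illustration): one bad case out of four.
    toy_losses = [0.0, 0.0, 0.0, 1.0]
    print("mean loss:", sum(toy_losses) / len(toy_losses))
    print("CVaR at beta=0.5:", empirical_cvar(toy_losses, 0.5))
    print("entropic risk at gamma=1:", empirical_entropic_risk(toy_losses, 1.0))
```

On the toy sample, the mean loss is 0.25 while CVaR at `beta=0.5` is 0.5, the average of the worst half of the losses. This gap is the tail sensitivity that an expected-loss guarantee does not see.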