Conditional Coverage Diagnostics for Conformal Prediction

📅 2025-12-12
🤖 AI Summary
Conditional coverage assessment remains a fundamental challenge in predictive reliability analysis, as existing conformal prediction methods only guarantee marginal coverage and lack localized (i.e., conditional) coverage guarantees. To address this, we propose a diagnostic framework centered on the Excess Risk of Target coverage (ERT), whose central novelty is to formulate conditional coverage deviation as a learnable classification risk-difference problem. Our approach leverages proper-loss-driven risk estimation, modern classifiers (e.g., deep neural networks), and theoretically grounded conservative upper bounds, enabling L1/L2-distance-based quantification, explicit separation of over- and under-coverage, and analysis under non-constant target coverage levels. Empirical evaluation demonstrates that ERT achieves significantly higher detection sensitivity than baselines such as CovGap. It has been successfully deployed for reliability benchmarking across diverse conformal methods and is accompanied by an open-source, unified evaluation toolkit.

📝 Abstract
Evaluating conditional coverage remains one of the most persistent challenges in assessing the reliability of predictive systems. Although conformal methods can give guarantees on marginal coverage, no method can guarantee to produce sets with correct conditional coverage, leaving practitioners without a clear way to interpret local deviations. To overcome sample-inefficiency and overfitting issues of existing metrics, we cast conditional coverage estimation as a classification problem. Conditional coverage is violated if and only if any classifier can achieve lower risk than the target coverage. Through the choice of a (proper) loss function, the resulting risk difference gives a conservative estimate of natural miscoverage measures such as L1 and L2 distance, and can even separate the effects of over- and under-coverage, and non-constant target coverages. We call the resulting family of metrics excess risk of the target coverage (ERT). We show experimentally that the use of modern classifiers provides much higher statistical power than simple classifiers underlying established metrics like CovGap. Additionally, we use our metric to benchmark different conformal prediction methods. Finally, we release an open-source package for ERT as well as previous conditional coverage metrics. Together, these contributions provide a new lens for understanding, diagnosing, and improving the conditional reliability of predictive systems.
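The abstract's core idea, that conditional coverage is violated if and only if some classifier on the coverage indicators can achieve lower risk than the constant target level, can be illustrated with a minimal sketch. This is a hypothetical reconstruction of that idea, not the authors' implementation: the function name `ert_log_loss`, the use of logistic regression, and the split ratio are all assumptions; the paper's ERT family admits other proper losses and modern classifiers.

```python
# Hypothetical sketch of the ERT (excess risk of target coverage) idea:
# conditional coverage is violated iff a classifier predicting the
# coverage indicator beats the constant target-level predictor in risk.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def ert_log_loss(X, covered, target=0.9, seed=0):
    """ERT under the log (cross-entropy) loss, a proper loss.

    X       : (n, d) features of the calibration/test points
    covered : (n,) binary array, 1 if the conformal set contained y_i
    target  : nominal coverage level 1 - alpha
    Returns max(0, risk(constant target) - risk(classifier)).
    A value > 0 signals that conditional coverage deviates from the
    target somewhere in feature space.
    """
    X_tr, X_te, z_tr, z_te = train_test_split(
        X, covered, test_size=0.5, random_state=seed)
    clf = LogisticRegression().fit(X_tr, z_tr)
    # Risk of the constant predictor that always outputs the target level.
    const_p = np.full(len(z_te), target)
    risk_const = log_loss(z_te, const_p, labels=[0, 1])
    # Risk of the learned conditional-coverage classifier.
    risk_clf = log_loss(z_te, clf.predict_proba(X_te)[:, 1], labels=[0, 1])
    return max(0.0, risk_const - risk_clf)
```

On data where the coverage indicator is independent of the features, no classifier can systematically beat the constant target, so the estimate stays near zero; when coverage depends on the features (e.g., over-coverage in one region, under-coverage in another), the classifier exploits that structure and the risk difference becomes positive.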
Problem

Research questions and friction points this paper is trying to address.

Evaluates conditional coverage reliability in predictive systems
Addresses sample inefficiency and overfitting in existing metrics
Benchmarks conformal prediction methods with new ERT metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Casting conditional coverage as classification problem
Using modern classifiers for higher statistical power
Introducing ERT metrics to benchmark conformal methods