Confidence on the Focal: Conformal Prediction with Selection-Conditional Coverage

📅 2024-03-06

📈 Citations: 7

✨ Influential: 0

🤖 AI Summary

Problem: Under data-driven selection, marginally valid conformal prediction intervals can fail to cover the outcomes of the selected (focal) units because of selection bias. Method: The paper proposes a general framework for constructing prediction sets with finite-sample exact coverage, conditional on a unit being selected. It generalizes Mondrian conformal prediction to multiple test units and non-equivariant classifiers, and it accommodates arbitrary selection rules that are invariant to permutations of the calibration units. Computationally efficient implementations are worked out for top-K selection, optimization-based selection, selection based on conformal p-values, and selection based on properties of preliminary conformal prediction sets. Contribution/Results: Applications in drug discovery and health risk prediction demonstrate the performance of the method on focal units.

📝 Abstract

Conformal prediction builds marginally valid prediction intervals that cover the unknown outcome of a randomly drawn test point with a prescribed probability. However, in practice, data-driven methods are often used to identify specific test unit(s) of interest, requiring uncertainty quantification tailored to these focal units. In such cases, marginally valid conformal prediction intervals may fail to provide valid coverage for the focal unit(s) due to selection bias. This paper presents a general framework for constructing a prediction set with finite-sample exact coverage, conditional on the unit being selected by a given procedure. The general form of our method accommodates arbitrary selection rules that are invariant to the permutation of the calibration units, and generalizes Mondrian Conformal Prediction to multiple test units and non-equivariant classifiers. We also work out computationally efficient implementations of our framework for a number of realistic selection rules, including top-K selection, optimization-based selection, selection based on conformal p-values, and selection based on properties of preliminary conformal prediction sets. The performance of our methods is demonstrated via applications in drug discovery and health risk prediction.
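
The selection-bias effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method or data: the data-generating process (noise that grows with the covariate), the stand-in model `predict`, and the top-K rule on predicted values are all assumptions chosen for illustration. It builds ordinary split conformal intervals and compares coverage over all test units with coverage over the selected units only.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw(n):
    """Toy data (an assumption): noise scale increases with x."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = x + (0.2 + 2.0 * x) * rng.standard_normal(n)
    return x, y

def predict(x):
    """Hypothetical pre-trained point predictor (here the true mean)."""
    return x

alpha = 0.1
n_cal, n_test, top_k = 1000, 5000, 250

x_cal, y_cal = draw(n_cal)
x_test, y_test = draw(n_test)

# Split conformal: the threshold is the k-th smallest calibration score,
# with k = ceil((n_cal + 1) * (1 - alpha)).
scores = np.abs(y_cal - predict(x_cal))
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q_hat = np.sort(scores)[k - 1]

# A test unit is covered when y lies in [predict(x) - q_hat, predict(x) + q_hat].
covered = np.abs(y_test - predict(x_test)) <= q_hat

# Data-driven selection: focal units are the top-K by predicted value.
selected = np.argsort(predict(x_test))[-top_k:]

marginal_coverage = covered.mean()
selected_coverage = covered[selected].mean()
print(f"coverage, all test units: {marginal_coverage:.3f}")
print(f"coverage, selected units: {selected_coverage:.3f}")
```

In this toy setup the selected units sit where the noise is largest, so their coverage tends to fall well below the nominal 1 - alpha level even though coverage over all test units stays close to it.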

Problem

Research questions and friction points this paper is trying to address.

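Ensuring valid coverage for selected focal units, not only for a randomly drawn test point

Correcting the selection bias that undermines marginally valid conformal prediction intervals

Extending Mondrian conformal prediction to multiple test units and non-equivariant classifiers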
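
The distinction between the two coverage notions can be written out as follows. The notation is assumed for illustration and is not taken from the page: calibration units (X_i, Y_i) for i = 1, ..., n, test units indexed by j = 1, ..., m, a data-driven selection set Ŝ, and a prediction set Ĉ at level 1 - α.

```latex
% Assumed notation (not from the source page):
%   (X_i, Y_i), i = 1..n        calibration units
%   X_{n+j}, Y_{n+j}, j = 1..m  test covariates and unobserved outcomes
%   \hat{S} \subseteq \{1,\dots,m\}  selection set chosen from the data
%   \hat{C}                     prediction set at level 1 - \alpha
\begin{align*}
  \text{Marginal coverage:} \quad
    & \mathbb{P}\bigl( Y_{n+j} \in \hat{C}(X_{n+j}) \bigr) \ge 1 - \alpha, \\
  \text{Selection-conditional coverage:} \quad
    & \mathbb{P}\bigl( Y_{n+j} \in \hat{C}(X_{n+j}) \,\big|\, j \in \hat{S} \bigr) \ge 1 - \alpha.
\end{align*}
```

The first statement averages over all test units. The second conditions on unit j having been selected, which is the event that matters for a focal unit and the one the abstract targets.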

Innovation

Methods, ideas, or system contributions that make the work stand out.

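Prediction sets with finite-sample exact coverage conditional on a unit being selected

Arbitrary selection rules that are invariant to permutations of the calibration units

Computationally efficient implementations for top-K, optimization-based, conformal p-value-based, and prediction-set-based selection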
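
For context, the Mondrian conformal prediction baseline that the framework generalizes can be sketched as split conformal calibration carried out separately within fixed categories. The sketch below is not the paper's selection-conditional procedure: the toy data, the identity predictor, and the two-category split `category` are assumptions, and the categories here are fixed in advance rather than chosen by a data-driven selection rule.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """k-th smallest score with k = ceil((n + 1) * (1 - alpha)); inf if k > n."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else np.sort(scores)[k - 1]

def mondrian_thresholds(scores, groups, alpha):
    """One conformal threshold per category, calibrated within that category."""
    return {int(g): conformal_quantile(scores[groups == g], alpha)
            for g in np.unique(groups)}

rng = np.random.default_rng(1)

def draw(n):
    """Toy data (an assumption): mean x, noise scale increasing with x."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = x + (0.2 + 2.0 * x) * rng.standard_normal(n)
    return x, y

def category(x):
    """Hypothetical fixed taxonomy: 1 for high-x units, 0 otherwise."""
    return (x > 0.8).astype(int)

alpha = 0.1
x_cal, y_cal = draw(2000)
x_test, y_test = draw(5000)

# Scores are absolute residuals of the identity predictor mu(x) = x.
thresholds = mondrian_thresholds(np.abs(y_cal - x_cal), category(x_cal), alpha)

test_groups = category(x_test)
test_thresholds = np.array([thresholds[int(g)] for g in test_groups])
covered = np.abs(y_test - x_test) <= test_thresholds

# Coverage is now checked category by category.
coverage_by_group = {g: float(covered[test_groups == g].mean()) for g in thresholds}
print(coverage_by_group)
```

Calibrating within each category keeps coverage near the nominal level in both groups. This works because the categories do not depend on the calibration or test outcomes; the abstract's contribution concerns the harder case where the selection rule itself is data-driven.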
