Everywhere Valid Bounds on False Discovery Proportions in Conformal Inference

📅 2026-05-20
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing multiple testing procedures for conformal inference, such as Benjamini-Hochberg, control only the expected false discovery proportion (FDP). They give no high-probability guarantee on the realized FDP, and their guarantees no longer hold when the rejection threshold is chosen after inspecting the data. The authors propose a distribution-free, finite-sample framework that builds a high-probability simultaneous envelope around the empirical distribution function of the null conformal p-values by sampling from their joint distribution. The envelope yields an upper bound on the FDP that holds simultaneously over all rejection thresholds, so the threshold can be selected post hoc. Users can also shape the envelope to tighten the bound in rejection regions of primary interest. The framework is applied to both outlier detection and conformal selection, and experiments on synthetic and real data indicate that the bounds are valid and substantially less conservative than those from existing approaches.
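One way to write the contrast described above is sketched below. The notation is assumed here for illustration and is not taken from the paper: $p_j$ is the conformal p-value of test point $j$, $\mathcal{H}_0$ is the set of true nulls, $t$ is the rejection threshold, $\hat{t}$ is a procedure's own data-dependent threshold, $\overline{\mathrm{FDP}}$ is the upper bound, and $\delta$ is the allowed failure probability.

```latex
\begin{align*}
R(t) &= \#\{j : p_j \le t\}, \qquad
V(t) = \#\{j \in \mathcal{H}_0 : p_j \le t\}, \qquad
\mathrm{FDP}(t) = \frac{V(t)}{\max\{R(t),\, 1\}} \\[4pt]
\text{expectation control:}\quad
& \mathbb{E}\bigl[\mathrm{FDP}(\hat{t})\bigr] \le \alpha \\[4pt]
\text{simultaneous high-probability control:}\quad
& \mathbb{P}\Bigl(\mathrm{FDP}(t) \le \overline{\mathrm{FDP}}(t)
  \ \text{for all } t \in [0, 1]\Bigr) \ge 1 - \delta
\end{align*}
```

Because the second guarantee holds for every $t$ at once, it continues to hold for any threshold chosen after looking at the data.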
📝 Abstract
Modern applications of conformal inference to multiple testing problems, such as outlier detection and candidate selection, often involve selecting test samples whose conformal p-values fall below a threshold. The quality of such methods is often measured by the false discovery proportion (FDP), defined as the fraction of incorrect selections. Existing approaches typically control the expected value of the FDP, using methods such as the Benjamini-Hochberg procedure. This approach fails to provide high-probability bounds on the realized false discovery proportion and invalidates statistical guarantees if the rejection threshold is selected after inspecting the data. This paper establishes finite-sample, distribution-free upper bounds on the FDP that hold simultaneously over all possible rejection thresholds, enabling arbitrary post hoc selection of the threshold. Simultaneous validity is achieved by constructing a high-probability envelope for the empirical distribution function of null conformal p-values by sampling from their joint distribution. Furthermore, our framework allows practitioners to modulate the envelope's shape, thereby producing tight bounds in rejection regions of primary interest. We use this flexible approach to derive simultaneous FDP upper bounds for both outlier detection and conformal selection. We demonstrate through synthetic and real-data experiments that the resulting bounds are both valid and substantially less conservative than those derived from existing approaches.
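The envelope idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' construction: the function names, the standardised-maximum statistic, and the `scale` weights are all hypothetical choices made for this example. It relies on two standard facts: with exchangeable, continuous scores the joint distribution of null conformal p-values does not depend on the score distribution, so it can be simulated with uniform scores; and treating all `m` test points as null can only increase the null count at every threshold, which makes the envelope conservative when some points are non-null.

```python
import numpy as np

def conformal_ranks(cal_scores, test_scores):
    """Integer ranks r_j = 1 + #{i : cal_i >= test_j}; the p-value is r_j / (n + 1).
    Larger scores are assumed to indicate more atypical points."""
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    return 1 + cal.size - np.searchsorted(cal, test, side="left")

def counts_at_thresholds(ranks, n):
    """counts[k-1] = #{j : r_j <= k} for k = 1..n+1, i.e. thresholds t = k / (n + 1)."""
    return np.cumsum(np.bincount(ranks, minlength=n + 2)[1:])

def null_envelope(n, m, delta=0.1, n_sims=500, seed=0):
    """Monte Carlo upper envelope for the null count V(t) over all thresholds,
    simulated as if all m test points were null (assumed worst case)."""
    rng = np.random.default_rng(seed)
    q = np.arange(1, n + 2) / (n + 1)        # thresholds t = k / (n + 1)
    center = m * q                           # E[V(t)] when every test point is null
    scale = np.sqrt(m * q * (1 - q)) + 1.0   # hypothetical shape; reweight to tighten a region
    stats = np.empty(n_sims)
    for b in range(n_sims):
        ranks = conformal_ranks(rng.random(n), rng.random(m))
        v = counts_at_thresholds(ranks, n)
        stats[b] = np.max((v - center) / scale)
    # Order statistic ceil((1 - delta)(B + 1)) absorbs the Monte Carlo error;
    # it requires n_sims to be large enough that k <= n_sims.
    k = int(np.ceil((1 - delta) * (n_sims + 1)))
    c = np.sort(stats)[k - 1]
    return np.clip(np.floor(center + c * scale + 1e-9), 0, m)

def fdp_bound(cal_scores, test_scores, envelope):
    """Upper bound on FDP at each threshold t = k / (n + 1), k = 1..n+1."""
    n = len(cal_scores)
    r = counts_at_thresholds(conformal_ranks(cal_scores, test_scores), n)
    return np.minimum(envelope, r) / np.maximum(r, 1)

# Usage on synthetic data: 80 inliers and 20 shifted outliers in the test set.
rng = np.random.default_rng(1)
n, m = 200, 100
cal = rng.normal(size=n)
test = np.concatenate([rng.normal(size=80), rng.normal(loc=3.0, size=20)])
env = null_envelope(n, m)
bounds = fdp_bound(cal, test, env)  # bounds[k-1] applies to threshold k / (n + 1)
```

Since `bounds` covers every threshold simultaneously, one can scan it and pick, for example, the largest threshold whose bound stays below a target level, without invalidating the guarantee. Changing `scale` changes where the envelope is tight, which mirrors the shape modulation mentioned in the abstract.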
Problem

Research questions and friction points this paper is trying to address.

False Discovery Proportion
Conformal Inference
Multiple Testing
Post Hoc Threshold Selection
Distribution-Free Bounds
Innovation

Methods, ideas, or system contributions that make the work stand out.

conformal inference
false discovery proportion
simultaneous bounds
post hoc thresholding
distribution-free
🔎 Similar Papers
2024-02-26
2024 IEEE International Conference on Knowledge Graph (ICKG)
Citations: 0