A General Stability Approach to False Discovery Rate Control

📅 2025-12-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing false discovery rate (FDR) control methods, such as Model-X knockoffs and data splitting, suffer from instability and irreproducibility due to algorithmic randomness in feature selection. Method: We propose FDR Stabilizer, a general framework that repeatedly executes a base FDR procedure to derive a consensus feature ranking, constructs a stabilized relaxed e-value for each feature, and applies the e-BH procedure to yield the final selected set. Contribution/Results: This work establishes a theoretical foundation for FDR stability, proving almost-sure convergence of the selection set to a deterministic limit. It guarantees finite-sample FDR control and asymptotically vanishing power loss. The framework is agnostic to the underlying FDR method and thus unifies multiple base procedures. Experiments on synthetic and real-world datasets demonstrate substantial improvements in selection stability and discovery reliability over state-of-the-art baselines.

📝 Abstract
Stability and reproducibility are essential considerations in various applications of statistical methods. False Discovery Rate (FDR) control methods are able to control false signals in scientific discoveries. However, many FDR control methods, such as Model-X knockoffs and data-splitting approaches, yield unstable results due to the inherent randomness of the algorithms. To enhance the stability and reproducibility of statistical outcomes, we propose a general stability approach for FDR control in feature selection and multiple testing problems, named FDR Stabilizer. Taking feature selection as an example, our method first aggregates feature importance statistics obtained from multiple runs of the base FDR control procedure into a consensus ranking. Then, we construct a stabilized relaxed e-value for each feature and apply the e-BH procedure to these stabilized e-values to obtain the final selection set. We theoretically derive finite-sample bounds for the FDR and the power of our method, and show that our method asymptotically controls the FDR without power loss. Moreover, we establish the stability of the proposed method, showing that the stabilized selection set converges to a deterministic limit as the number of repetitions increases. Extensive numerical experiments and applications to real datasets demonstrate that the proposed method generally outperforms existing alternatives.
Problem

Research questions and friction points this paper is trying to address.

Enhances stability and reproducibility of FDR control methods
Addresses instability from randomness in feature selection algorithms
Controls false discovery rates without sacrificing statistical power
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aggregates feature importance statistics into consensus ranking
Constructs stabilized relaxed e-values for each feature
Applies e-BH procedure to stabilized e-values for final selection
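The three bullets above can be sketched in code. Note the paper's actual "stabilized relaxed e-value" construction is more involved; as a simplifying assumption, this sketch just averages per-run e-values feature-wise (the mean of e-values is itself a valid e-value) and then applies the standard e-BH selection rule. The function names `ebh` and `stabilized_selection` are illustrative, not from the paper.

```python
import numpy as np

def ebh(e_values, alpha=0.1):
    """e-BH procedure: select the k hypotheses with the largest
    e-values, where k is the largest index such that the k-th
    largest e-value is at least m / (k * alpha)."""
    m = len(e_values)
    order = np.argsort(e_values)[::-1]       # feature indices, e-values descending
    sorted_e = np.asarray(e_values)[order]
    ks = np.arange(1, m + 1)
    passing = sorted_e >= m / (ks * alpha)   # e-BH threshold at each rank
    if not passing.any():
        return np.array([], dtype=int)       # nothing selected
    k = np.max(ks[passing])
    return np.sort(order[:k])                # indices of selected features

def stabilized_selection(run_e_values, alpha=0.1):
    """Aggregate e-values from multiple runs of a base FDR procedure
    (rows = runs, columns = features) by averaging, then apply e-BH.
    Averaging preserves validity, so FDR control carries over."""
    e_bar = np.mean(run_e_values, axis=0)
    return ebh(e_bar, alpha)
```

For example, with four features where only the first carries strong evidence, `stabilized_selection([[20, 1, 1, 1], [20, 1, 1, 1]], alpha=0.5)` selects only feature 0, since the averaged e-value 20 clears the rank-1 threshold 4 / (1 * 0.5) = 8 while the others do not.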