🤖 AI Summary
This study addresses the lack of a power-optimal procedure for block-structured multiple hypothesis testing under strong family-wise error rate (FWER) control. The authors propose BOOST (Block-Optimal Objective-driven Strong-FWER Testing), a framework that is power-optimal within the block-separable class while guaranteeing strong FWER control, applicable to settings such as gatekeeping trials, dose-finding studies, multi-tissue eQTL mapping, and bundled-challenger A/B experiments. The method is the first provably power-optimal solution for blocks of size three. It combines a KKT-condition-driven marginal-power balancing algorithm, an accelerated closed-testing computation, sample-splitting plug-in estimation, and a Šidák-type refinement to ensure finite-sample validity and robustness to heterogeneous block structures and unknown alternative distributions. In simulations, BOOST achieves 1.4–1.7× higher power than the strongest existing baseline across diverse correlation and sparsity regimes, and on real-world eQTL and A/B testing data it certifies an order of magnitude more full-block discoveries.
📝 Abstract
Structured multiple-testing problems (gatekeeping trials, dose-finding, multi-tissue eQTL mapping, bundled-challenger A/B experiments) organize hypotheses into design-imposed blocks and demand strong family-wise error rate (FWER) control for confirmatory claims. Practitioners currently use objective-agnostic stepwise rules (Bonferroni, Holm, Hochberg, Hommel), closed-testing and graphical extensions, or hierarchical and resampling methods; none is power-optimal within the block-separable class these designs induce. We introduce BOOST (Block-Optimal Objective-driven Strong-FWER Testing), the power-optimal strong-FWER procedure for block size three, with three guarantees: (i) finite-sample strong-FWER validity at $O(K)$ cost (versus $O(K^2)$ for general closed testing) without independence assumptions, with a strict Šidák improvement under cross-block independence; (ii) power-optimal allocation across heterogeneous blocks via an equalized-marginal KKT condition, solvable by bisection in $O(B\log(1/\varepsilon))$; and (iii) a sample-split plug-in variant for unknown alternative density $g$, attaining $\alpha$-control up to $O(B_T\,\mathbb{E}\|g-\widehat g\|_\infty)$ inflation with per-hypothesis power deficit independent of $B_T$. Simulations across independent, equicorrelated, sparse, and misspecified regimes show 1.4–1.7$\times$ power gains over the strongest existing baseline at calibrated FWER. On two published datasets (BLUEPRINT cross-lineage cis-eQTL and Upworthy bundled-challenger A/B experiments), BOOST certifies an order of magnitude more full-block discoveries than existing baselines at controlled FWER.
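
To make the equalized-marginal idea in guarantee (ii) concrete, the following is a minimal sketch and not the BOOST procedure itself. It assumes a toy model that does not come from the paper: each block is reduced to a single one-sided z-test with a known effect size `mu`, so its power at level $a$ is $\Phi(\mu - z_{1-a})$ and its marginal power is $\exp(\mu z_{1-a} - \mu^2/2)$. Under that assumption, a Bonferroni-style budget $\sum_b a_b \le \alpha$ is split so that every block has the same marginal power, and the common multiplier is found by bisection. The function names (`allocate_levels`, `marginal_power`, `level_at_log_multiplier`) and the bracket `[-50, 50]` are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal from the standard library


def level_at_log_multiplier(mu: float, log_lam: float) -> float:
    """Level a at which the toy block's marginal power equals exp(log_lam).

    Toy assumption: power(a) = Phi(mu - z_{1-a}) with mu > 0, so
    d power / d a = exp(mu * z_{1-a} - mu**2 / 2). Solving for z_{1-a}
    gives the closed form used here.
    """
    z = (log_lam + 0.5 * mu * mu) / mu
    return STD_NORMAL.cdf(-z)


def marginal_power(mu: float, level: float) -> float:
    """Derivative of the toy power function with respect to the level."""
    z = -STD_NORMAL.inv_cdf(level)  # upper quantile z_{1-a}
    return math.exp(mu * z - 0.5 * mu * mu)


def allocate_levels(mus, alpha, tol=1e-12):
    """Split alpha across blocks so that marginal powers are equal.

    Bisection on the log of the KKT multiplier: the total level spent is
    decreasing in the multiplier, so each step halves the bracket at a
    cost of one pass over the blocks.
    """
    lo, hi = -50.0, 50.0  # assumed bracket, adequate for moderate mu
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        spent = sum(level_at_log_multiplier(mu, mid) for mu in mus)
        if spent > alpha:
            lo = mid  # overspending the budget: raise the multiplier
        else:
            hi = mid
    # Return the allocation at the upper end so the budget is not exceeded.
    return [level_at_log_multiplier(mu, hi) for mu in mus]


levels = allocate_levels([1.0, 2.0, 3.0], alpha=0.05)
print(levels, sum(levels))
```

Each bisection step touches every block once, which is consistent with the $O(B\log(1/\varepsilon))$ cost quoted in the abstract. The power functions that BOOST actually balances for blocks of size three are defined in the paper and are not reproduced here.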