Weighted Holm Procedures: Theory, Properties, and Recommendations

📅 2026-04-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses how to gain statistical power while controlling the familywise error rate (FWER) in multiple hypothesis testing when hypotheses carry unequal importance. The authors systematically compare two weighted Holm procedures: the weighted Holm procedure (WHP), which orders weighted p-values, and the weighted alternative Holm procedure (WAP), which orders raw p-values. Using closed testing procedures, graphical representations, and derivations of adjusted p-values, they show that WHP is uniformly more powerful than WAP. WAP, although consonant, lacks monotonicity and is optimal only under specific conditions, whereas WHP cannot be improved by enlarging any of its critical values without violating FWER control. Simulation studies support these findings and highlight the superior FWER control and average power of WHP, which leads to practical recommendations for its use.

📝 Abstract

In many statistical applications, particularly in clinical studies, hypotheses may carry different levels of importance, motivating the use of weighted multiple testing procedures (wMTPs) to control the familywise error rate (FWER). Among these approaches, two weighted Holm procedures are commonly used: the weighted Holm procedure (WHP), which is based on ordered weighted $p$-values, and the weighted alternative Holm procedure (WAP), which relies on ordered raw $p$-values. This paper provides a systematic comparison of these two procedures, along with practical recommendations for their use. We first examine their corresponding closed testing procedures (CTPs) and show that WHP is uniformly more powerful than WAP. We further investigate their structural properties, demonstrating that WAP, while consonant, lacks monotonicity. To facilitate communication with non-statisticians, we introduce graphical representations of both procedures using a common initial graph and distinct updating strategies. In addition, we derive adjusted $p$-values and adjusted weighted $p$-values for both methods. Finally, we establish an optimality result: WHP cannot be improved by enlarging any of its critical values without violating FWER control, whereas WAP is optimal only under specific conditions. Simulation studies support these theoretical findings and highlight the superior FWER control and average power of WHP.
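To make the difference between the two orderings concrete, the sketch below implements both step-down rules in one function. It follows the usual textbook formulation, which is an assumption here because the abstract does not state the critical values: at each step the hypothesis under test, with $p$-value $p_i$ and weight $w_i$, is rejected if $p_i \le \alpha\, w_i / \sum_{j \in R} w_j$, where $R$ is the set of hypotheses not yet rejected, and testing stops at the first failure. The function name `weighted_holm` and the flag `order_by_weighted` are illustrative, not taken from the paper.

```python
from typing import List, Sequence


def weighted_holm(
    p: Sequence[float],
    w: Sequence[float],
    alpha: float = 0.05,
    order_by_weighted: bool = True,
) -> List[bool]:
    """Step-down weighted Holm sketch (assumed textbook formulation).

    order_by_weighted=True  -> WHP: test hypotheses in order of p_i / w_i.
    order_by_weighted=False -> WAP: test hypotheses in order of raw p_i.
    Returns a list of rejection decisions in the original hypothesis order.
    """
    if len(p) != len(w):
        raise ValueError("p and w must have the same length")
    if any(wi <= 0 for wi in w):
        raise ValueError("weights must be strictly positive")

    m = len(p)
    if order_by_weighted:
        order = sorted(range(m), key=lambda i: p[i] / w[i])
    else:
        order = sorted(range(m), key=lambda i: p[i])

    rejected = [False] * m
    for step, i in enumerate(order):
        # Total weight of the hypotheses not yet rejected, including H_i.
        remaining = sum(w[j] for j in order[step:])
        if p[i] <= alpha * w[i] / remaining:
            rejected[i] = True
        else:
            break  # step-down: stop at the first non-rejection
    return rejected


# Two hypotheses, the first carrying most of the weight.
p_values = [0.03, 0.02]
weights = [0.8, 0.2]
whp = weighted_holm(p_values, weights, alpha=0.05, order_by_weighted=True)
wap = weighted_holm(p_values, weights, alpha=0.05, order_by_weighted=False)
```

In this toy input, ordering by weighted $p$-values tests the heavily weighted hypothesis first and rejects both hypotheses, while ordering by raw $p$-values starts with the lightly weighted hypothesis, fails its threshold of $0.01$, and rejects nothing. This is a single illustration consistent with the claim that WHP is uniformly more powerful than WAP, not a demonstration of it. With equal weights both variants reduce to the ordinary Holm procedure.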

Problem

Research questions and friction points this paper is trying to address.

weighted Holm procedure

familywise error rate

multiple testing

hypothesis weighting

FWER control

Innovation

Methods, ideas, or system contributions that make the work stand out.

weighted Holm procedure

familywise error rate

closed testing procedure