🤖 AI Summary
This study addresses the multiple testing problem in replication studies under the clinical trial “two-trials rule.” We propose a weighted Bonferroni procedure that optimizes hypothesis-specific weights using data from the original study, ensuring strict control of the family-wise error rate (FWER) while maximizing disjunctive power, that is, the probability of rejecting at least one false null hypothesis. The key innovation is to allocate the replication-stage testing weights according to the statistical evidence from the original trial, using a power-driven weighting scheme that remains robust when effect sizes differ between the two studies. Compared with the standard, equally weighted Bonferroni procedure, the method achieves substantially higher disjunctive power at identical sample sizes, which improves the reliability of replication conclusions and the efficiency of decision-making. The approach offers a new option for multi-endpoint clinical trials that prioritize reproducibility, balancing statistical rigor with practical applicability.
📝 Abstract
Replication studies for scientific research are an important part of ensuring the reliability and integrity of experimental findings. In the context of clinical trials, the concept of replication has been formalised by the 'two-trials' rule, where two pivotal studies are required to show positive results before a drug can be approved. In experiments testing multiple hypotheses simultaneously, control of the overall familywise error rate (FWER) is additionally required in many contexts. The well-known Bonferroni procedure controls the FWER, and a natural extension is to introduce weights into this procedure to reflect the a-priori importance of hypotheses or to maximise some measure of the overall power of the experiment. In this paper, we consider analysing a replication study using an optimal weighted Bonferroni procedure, with the weights based on the results of the original study that is being replicated and the optimality criterion being to maximise the disjunctive power of the trial (the power to reject at least one non-null hypothesis). We show that using the proposed procedure can lead to a substantial increase in the disjunctive power of the replication study, and that it is robust to changes in the effect sizes between the two studies.
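To make the idea concrete, the following is a minimal sketch of a weighted Bonferroni analysis and not the paper's own algorithm. A weighted Bonferroni procedure rejects hypothesis i when its p-value is at most w_i · α, where the weights are non-negative and sum to one. The sketch assumes two hypotheses, one-sided z-tests with unit variance, independent test statistics, a one-sided level of α = 0.025, and a plain grid search over the weights; the function names and the example effect sizes are hypothetical. The noncentrality parameters `thetas` stand in for the effect sizes anticipated in the replication study, for example plug-in estimates taken from the original study.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, N(0, 1)


def marginal_power(theta, level):
    """Power of a one-sided z-test at `level` when the statistic is N(theta, 1).

    Assumption: unit-variance normal test statistic (illustrative only).
    """
    if level <= 0.0:
        return 0.0  # a hypothesis with zero weight can never be rejected
    if level >= 1.0:
        return 1.0
    critical = _STD_NORMAL.inv_cdf(1.0 - level)
    return 1.0 - _STD_NORMAL.cdf(critical - theta)


def disjunctive_power(thetas, weights, alpha=0.025):
    """Probability of rejecting at least one hypothesis.

    Assumption: the test statistics are independent, so the probability of
    rejecting nothing is the product of the individual miss probabilities.
    """
    miss_all = 1.0
    for theta, w in zip(thetas, weights):
        miss_all *= 1.0 - marginal_power(theta, w * alpha)
    return 1.0 - miss_all


def best_weights_two_hypotheses(thetas, alpha=0.025, steps=1000):
    """Grid search over (w1, 1 - w1) for the weights with the highest
    disjunctive power. A hypothetical stand-in for a proper optimiser."""
    best_w, best_power = None, -1.0
    for k in range(steps + 1):
        w1 = k / steps
        weights = (w1, 1.0 - w1)
        power = disjunctive_power(thetas, weights, alpha)
        if power > best_power:
            best_w, best_power = weights, power
    return best_w, best_power


if __name__ == "__main__":
    # Hypothetical anticipated noncentrality parameters for two endpoints.
    thetas = (3.0, 1.0)
    weights, power = best_weights_two_hypotheses(thetas)
    equal = disjunctive_power(thetas, (0.5, 0.5))
    print(f"optimised weights: {weights}, disjunctive power: {power:.3f}")
    print(f"equal weights: disjunctive power: {equal:.3f}")
```

Because the weights are computed from the original study alone, they are fixed before the replication data are seen. If the two studies are independent, the usual union-bound argument for FWER control at level α then applies conditionally on the original study.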