🤖 AI Summary
This study addresses a limitation of conventional Bayesian group sequential designs: a fixed posterior probability threshold applied at every analysis does not penalise early looks, so such designs can stop trials prematurely and inflate treatment effect estimates, which raises concerns for confirmatory trials. The authors propose two practical refinements, each requiring only one additional tuning parameter: a two-phase posterior probability threshold that is stricter early and relaxed later, and predictive probability monitoring at interim looks. Both remain conservative early while preserving power later in the trial. In the HYPRESS trial setting, the calibrated designs control the Type I error rate, achieve higher power than the conventional Bayesian design, and produce early alpha-spending profiles that closely match O’Brien–Fleming-type boundaries.
📝 Abstract
Group sequential designs (GSDs) are widely used in confirmatory trials to allow interim monitoring while preserving control of the type I error rate. In the frequentist framework, O'Brien-Fleming-type stopping boundaries dominate practice because they impose highly conservative early stopping while allowing more liberal decisions as information accumulates. Bayesian GSDs, in contrast, are most often implemented using fixed posterior probability thresholds applied uniformly at all analyses. While such designs can be calibrated to control the overall type I error rate, they do not penalise early analyses and can therefore lead to substantially more aggressive early stopping. Such behaviour can risk premature conclusions and inflation of treatment effect estimates, raising concerns for confirmatory trials. We introduce two practically implementable refinements that restore conservative early stopping in Bayesian GSDs. The first introduces a two-phase structure for posterior probability thresholds, applying more stringent criteria in the early phase of the trial and relaxing them later to preserve power. The second replaces posterior probability monitoring at interim looks with predictive probability criteria, which naturally account for uncertainty in future data and therefore suppress premature stopping. Both strategies require only one additional tuning parameter and can be efficiently calibrated. In the HYPRESS setting, both approaches achieve higher power than the conventional Bayesian design while producing alpha-spending profiles closely aligned with O'Brien-Fleming-type behaviour at early looks. These refinements provide a principled and tractable way to align Bayesian GSDs with accepted frequentist practice and regulatory expectations, supporting their robust application in confirmatory trials.
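The two refinements can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a normally distributed test statistic with a flat prior, so that the posterior probability of benefit is Φ(z) and the predictive probability of final success has a closed form. The function names and the values used for `tau`, `c_early`, `c_late`, `c_final` and `pp_cut` are hypothetical, chosen only for illustration, and are not calibrated to any type I error rate.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf and quantiles


def posterior_prob_benefit(z):
    """P(theta > 0 | data), ASSUMING a flat prior and a normal likelihood.

    z is the interim z-statistic (estimate divided by its standard error).
    """
    return _STD_NORMAL.cdf(z)


def two_phase_threshold(t, tau, c_early, c_late):
    """Posterior threshold at information fraction t.

    tau is the single extra tuning parameter in this sketch: looks at or
    before tau use the stringent c_early, later looks use c_late.
    """
    return c_early if t <= tau else c_late


def stop_two_phase(z, t, tau, c_early, c_late):
    """Stop for efficacy if the posterior probability exceeds the threshold."""
    return posterior_prob_benefit(z) > two_phase_threshold(t, tau, c_early, c_late)


def predictive_prob_success(z, t, c_final):
    """Predictive probability that the final posterior exceeds c_final.

    ASSUMES a flat prior: given interim z at information fraction t, the
    final z-statistic is predictively normal with mean z / sqrt(t) and
    variance (1 - t) / t.
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1 at an interim look")
    z_crit = _STD_NORMAL.inv_cdf(c_final)
    return _STD_NORMAL.cdf((z - sqrt(t) * z_crit) / sqrt(1.0 - t))


def stop_predictive(z, t, c_final, pp_cut):
    """Stop early only if final success is predicted with probability > pp_cut."""
    return predictive_prob_success(z, t, c_final) > pp_cut


if __name__ == "__main__":
    z, t = 2.2, 0.25  # hypothetical first look at 25% information
    print("posterior probability:", posterior_prob_benefit(z))
    print("fixed threshold 0.98 stops:", posterior_prob_benefit(z) > 0.98)
    print("two-phase rule stops:", stop_two_phase(z, t, 0.5, 0.9999, 0.98))
    print("predictive rule stops:", stop_predictive(z, t, 0.975, 0.99))
```

With these illustrative values, a fixed threshold of 0.98 would stop at the first look, whereas both the two-phase rule and the predictive rule continue the trial. In practice the thresholds would be calibrated by simulation under the trial's actual endpoint model.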