Demonstration Experiments

📅 2026-03-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses adaptive experiments whose goal is to demonstrate that at least one candidate intervention yields a positive effect, for some subpopulation and on some measured outcome, rather than to estimate effect sizes precisely. The authors formalize this objective in a multi-armed bandit framework and propose two procedures for testing whether any arm's mean exceeds a given threshold under fully adaptive sampling: one pools information across promising arms, and the other performs time-uniform multiple inference on the means of individual arms. To support the second procedure, they establish a moderate deviations principle for the sequential t-statistic, which justifies anytime-valid testing of a large number of hypotheses concurrently. They also recast experimental design as bandit optimization in which an arm's reward is its signal-to-noise ratio, and they prove a logarithmic regret bound for an adaptive allocation rule.

📝 Abstract

Adaptive experiments are used extensively in online platforms, healthcare and biotechnology, and a variety of other settings. In many of these applications, the main goal is not to precisely estimate a treatment effect, but to demonstrate that at least one candidate intervention yields a positive effect, for some subpopulation, on some measured outcome. We formalize this objective in a multi-armed bandit framework and develop inference procedures for testing whether any arm's mean exceeds a given threshold under fully adaptive sampling: one which pools information across promising arms, and one which corresponds to time-uniform multiple inference on the means of individual arms. To support the latter, we establish a moderate deviations principle for the sequential t-statistic, justifying anytime-valid testing of a large number of hypotheses concurrently. To illustrate how adaptive design can target the proposed statistics, we recast experimental design as bandit optimization where an arm's reward corresponds to its signal-to-noise ratio, and analyze an adaptive allocation rule for which we establish a logarithmic regret bound.
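
To make the quantities in the abstract concrete, the following Python sketch tracks, for each arm, a running mean and sample standard deviation, the empirical signal-to-noise ratio relative to a threshold, and the corresponding t-statistic. It is only an illustration under stated assumptions: the class `ArmStats`, the function `run_experiment`, the Gaussian outcomes, and the UCB-style exploration bonus are all hypothetical choices made here, not the allocation rule or test analyzed in the paper. No critical values are applied, because the anytime-valid thresholds depend on the paper's theory, which is not reproduced here.

```python
import math
import random


class ArmStats:
    """Running mean and variance for one arm (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample standard deviation; treated as 0.0 below two observations.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def snr(self, threshold=0.0):
        # Empirical signal-to-noise ratio: (mean - threshold) / std.
        # Returns 0.0 when the standard deviation is not yet usable.
        s = self.std()
        return (self.mean - threshold) / s if s > 0 else 0.0

    def t_stat(self, threshold=0.0):
        # One-sample t-statistic for "mean exceeds threshold".
        return math.sqrt(self.n) * self.snr(threshold)


def run_experiment(true_means, true_sds, horizon, threshold=0.0, seed=0):
    """Hypothetical adaptive loop: pull the arm with the largest
    optimistic SNR index (a generic UCB-style bonus, assumed here)."""
    rng = random.Random(seed)
    arms = [ArmStats() for _ in true_means]
    for t in range(horizon):
        if t < 2 * len(arms):
            k = t % len(arms)  # two initial pulls per arm
        else:
            k = max(
                range(len(arms)),
                key=lambda i: arms[i].snr(threshold)
                + math.sqrt(2.0 * math.log(t + 1) / arms[i].n),
            )
        arms[k].update(rng.gauss(true_means[k], true_sds[k]))
    return arms


if __name__ == "__main__":
    arms = run_experiment([0.0, 0.5, -0.2], [1.0, 1.0, 1.0], horizon=2000)
    for i, a in enumerate(arms):
        print(i, a.n, round(a.snr(), 3), round(a.t_stat(), 3))
```

Running the script prints, for each simulated arm, the number of pulls, the empirical signal-to-noise ratio, and the t-statistic. Using the signal-to-noise ratio as the index reflects the abstract's framing that an arm's reward corresponds to its signal-to-noise ratio; everything else in the loop is a placeholder.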

Problem

Research questions and friction points this paper is trying to address.

adaptive experiments

treatment effect

multi-armed bandit

hypothesis testing

subpopulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive experimentation

multi-armed bandit

anytime-valid inference