Assumption-lean weak limits and tests for two-stage adaptive experiments

📅 2025-05-15
🤖 AI Summary
Adaptive experiments are increasingly deployed in real-world settings, yet the statistical foundations for inference in them remain underdeveloped. This is particularly true of two-stage designs, where the asymptotic behavior of weighted inverse probability weighted (WIPW) estimators has lacked a unified characterization across signal regimes. This paper derives new weak convergence results for mean outcomes and their differences under significantly weaker assumptions than prior work, and sharply characterizes phase transitions in the limiting behavior across signal regimes. Because the limits can be non-normal, the authors propose a computationally efficient, provably valid plug-in bootstrap procedure for hypothesis testing. The results are general enough to accommodate various adaptive designs, including batched bandit and subgroup enrichment experiments. Simulations and semi-synthetic studies demonstrate the practical value of the approach and reveal statistical phenomena unique to adaptive experiments.

📝 Abstract
Adaptive experiments are becoming increasingly popular in real-world applications for effectively maximizing in-sample welfare and efficiency by data-driven sampling. Despite their growing prevalence, however, the statistical foundations for valid inference in such settings remain underdeveloped. Focusing on two-stage adaptive experimental designs, we address this gap by deriving new weak convergence results for mean outcomes and their differences. In particular, our results apply to a broad class of estimators, the weighted inverse probability weighted (WIPW) estimators. In contrast to prior works, our results require significantly weaker assumptions and sharply characterize phase transitions in limiting behavior across different signal regimes. Through this common lens, our general results unify previously fragmented results under the two-stage setup. To address the challenge of potential non-normal limits in conducting inference, we propose a computationally efficient and provably valid plug-in bootstrap method for hypothesis testing. Our results and approaches are sufficiently general to accommodate various adaptive experimental designs, including batched bandit and subgroup enrichment experiments. Simulations and semi-synthetic studies demonstrate the practical value of our approach, revealing statistical phenomena unique to adaptive experiments.
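
To make the setting concrete, here is a minimal sketch of a two-arm, two-stage experiment and a weighted combination of per-stage IPW estimates. Everything specific in it is an assumption for illustration and is not taken from the paper: the Gaussian outcomes, the epsilon-greedy style rule for the second stage, the function names, and the two stage-weight choices (`constant` and `adaptive`). The paper's own definition of the WIPW class may differ.

```python
import random


def stage_ipw(stage, arm):
    """IPW estimate of one arm's mean outcome within a single stage."""
    arms, ys, e = stage
    p = e if arm == 1 else 1 - e  # probability of assigning `arm` in this stage
    return sum(y / p for a, y in zip(arms, ys) if a == arm) / len(ys)


def simulate_two_stage(rng, n1, n2, means, eps=0.1):
    """Simulate two stages; each stage is (arms, outcomes, prob_of_arm_1)."""
    def draw(n, e):
        arms = [1 if rng.random() < e else 0 for _ in range(n)]
        ys = [rng.gauss(means[a], 1.0) for a in arms]  # unit-variance outcomes
        return arms, ys, e

    stage1 = draw(n1, 0.5)  # stage 1: balanced randomization
    # Stage 2 (hypothetical rule): favour the arm that looked better in stage 1,
    # but keep probability `eps` on the other arm.
    e2 = 1 - eps if stage_ipw(stage1, 1) > stage_ipw(stage1, 0) else eps
    return [stage1, draw(n2, e2)]


def wipw_mean(stages, arm, stage_weight):
    """Combine per-stage IPW estimates using normalized stage weights."""
    weights = []
    for arms, ys, e in stages:
        p = e if arm == 1 else 1 - e
        weights.append(stage_weight(p, len(ys)))
    total = sum(weights)
    return sum(w / total * stage_ipw(s, arm) for w, s in zip(weights, stages))


# Two illustrative weightings (assumptions, not the paper's definitions):
constant = lambda p, n: n             # reduces to pooled IPW over both stages
adaptive = lambda p, n: n * p ** 0.5  # down-weights stages where the arm is rare

rng = random.Random(0)
stages = simulate_two_stage(rng, 500, 500, means=[0.0, 0.5])
for name, h in [("constant", constant), ("adaptive", adaptive)]:
    diff = wipw_mean(stages, 1, h) - wipw_mean(stages, 0, h)
    print(name, round(diff, 3))
```

With the `constant` weighting the estimator is the ordinary IPW average over all units. The `adaptive` weighting shows how a stage in which an arm is sampled with low probability can be given less influence on that arm's estimate.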

Problem

Research questions and friction points this paper is trying to address.

Develop weak convergence results for two-stage adaptive experiments
Propose valid inference methods for non-normal limits
Unify fragmented results under two-stage adaptive designs

Innovation

Methods, ideas, or system contributions that make the work stand out.

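New weak convergence results for mean outcomes and their differences in two-stage adaptive experiments
Computationally efficient, provably valid plug-in bootstrap test for non-normal limits
Coverage of the broad class of WIPW estimators under weaker assumptions than prior work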
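
The plug-in bootstrap idea can be illustrated with a rough sketch. This page does not give the paper's procedure, so the code below only shows the general pattern under stated assumptions: estimate the unknown quantities from the data, plug them into a simulation of the whole two-stage experiment under the null of equal arm means (including the adaptive second stage), and compare the observed statistic with the simulated ones. The Gaussian outcome model, the stage-2 rule, the pooled-IPW statistic, and all names (`EPS`, `simulate`, `plugin_bootstrap_pvalue`) are hypothetical.

```python
import random
import statistics

EPS = 0.1  # hypothetical exploration floor for the stage-2 rule


def ipw(arms, ys, e, arm):
    """IPW estimate of one arm's mean within a stage."""
    p = e if arm == 1 else 1 - e
    return sum(y / p for a, y in zip(arms, ys) if a == arm) / len(ys)


def draw_stage(rng, n, e, means, sd):
    arms = [1 if rng.random() < e else 0 for _ in range(n)]
    ys = [rng.gauss(means[a], sd) for a in arms]
    return arms, ys, e


def simulate(rng, n1, n2, means, sd):
    """Two-stage experiment: balanced stage 1, data-driven stage 2."""
    s1 = draw_stage(rng, n1, 0.5, means, sd)
    e2 = 1 - EPS if ipw(*s1, 1) > ipw(*s1, 0) else EPS
    return [s1, draw_stage(rng, n2, e2, means, sd)]


def statistic(stages):
    """sqrt(N) times the pooled IPW difference in means (arm 1 minus arm 0)."""
    n = sum(len(ys) for _, ys, _ in stages)
    diff = 0.0
    for arms, ys, e in stages:
        diff += len(ys) / n * (ipw(arms, ys, e, 1) - ipw(arms, ys, e, 0))
    return n ** 0.5 * diff


def plugin_bootstrap_pvalue(stages, n_boot=500, seed=0):
    """Two-sided p-value for H0: equal arm means, via plug-in simulation."""
    rng = random.Random(seed)
    ys_all = [y for _, ys, _ in stages for y in ys]
    mu_hat = statistics.mean(ys_all)   # plug-in common mean under H0
    sd_hat = statistics.stdev(ys_all)  # plug-in outcome standard deviation
    n1, n2 = len(stages[0][1]), len(stages[1][1])
    t_obs = abs(statistic(stages))
    exceed = 0
    for _ in range(n_boot):
        # Re-run the full adaptive design under the null with plug-in values.
        boot = simulate(rng, n1, n2, [mu_hat, mu_hat], sd_hat)
        if abs(statistic(boot)) >= t_obs:
            exceed += 1
    return (1 + exceed) / (1 + n_boot)


data = simulate(random.Random(1), 300, 300, [0.0, 0.6], 1.0)
print(plugin_bootstrap_pvalue(data))
```

Re-simulating the adaptive allocation inside each bootstrap draw is the point of the sketch: the reference distribution then reflects the data-dependent second stage instead of assuming a normal limit.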