Anytime-Valid Linear Models and Regression Adjusted Causal Inference in Randomized Experiments

📅 2022-10-16
📈 Citations: 5
Influential: 1
🤖 AI Summary
Traditional linear-model t- and F-tests assume a fixed sample size, making them unsuitable for sequential A/B testing with continuous monitoring and early stopping, where repeated looks at the data inflate the Type-I error. This paper develops an anytime-valid framework for causal inference under linear regression adjustment, introducing closed-form anytime-valid F-tests and confidence sequences for both parametric and nonparametric settings. Under standard randomized designs, and without strong modeling assumptions, the method guarantees time-uniform Type-I error control and valid confidence coverage throughout the entire sequential experiment. All test statistics are computable directly from standard regression output, enabling real-time significance assessment and dynamically updated confidence intervals. Deployed on Netflix's industrial-scale A/B testing platform, the method supports regression-adjusted sequential analysis with pre-treatment covariates and safeguards against p-hacking.
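To make the "continuous monitoring with uniform Type-I error control" idea concrete, here is a minimal sketch, not the paper's exact sequential F-test: it monitors a Gaussian-mixture e-process built from standardized scores (the function name `mixture_e_process` and the mixture precision `rho=10.0` are assumptions for this illustration) and turns it into an anytime-valid p-value that is safe to inspect after every observation.

```python
import numpy as np

def mixture_e_process(s, v, rho):
    """Gaussian-mixture e-process. Under H0 the running score sum s is
    sub-Gaussian with variance process v, so this is a nonnegative
    martingale with expectation 1; by Ville's inequality,
    P(sup_t E_t >= 1/alpha) <= alpha. rho is a mixture-precision tuning choice."""
    return np.sqrt(rho / (rho + v)) * np.exp(s ** 2 / (2 * (rho + v)))

rng = np.random.default_rng(0)
z = rng.normal(size=1000)                  # standardized scores under the null
S = np.cumsum(z)                           # running score sum
V = np.arange(1, z.size + 1, dtype=float)  # variance process (unit-variance scores)

E = mixture_e_process(S, V, rho=10.0)
# Anytime-valid p-value: non-increasing in t and valid at every look.
p = np.minimum(1.0, 1.0 / np.maximum.accumulate(E))
```

Rejecting whenever the running p-value drops below alpha controls the Type-I error uniformly over all stopping rules, which is exactly the property that makes continuous monitoring and early stopping safe.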
📝 Abstract
Linear regression adjustment is commonly used to analyse randomised controlled experiments due to its efficiency and robustness against model misspecification. Current testing and interval estimation procedures leverage the asymptotic distribution of such estimators to provide Type-I error and coverage guarantees that hold only at a single sample size. Here, we develop the theory for the anytime-valid analogues of such procedures, enabling linear regression adjustment in the sequential analysis of randomised experiments. We first provide sequential $F$-tests and confidence sequences for the parametric linear model, which provide time-uniform Type-I error and coverage guarantees that hold for all sample sizes. We then relax all linear model parametric assumptions in randomised designs and provide nonparametric model-free sequential tests and confidence sequences for treatment effects. This formally allows experiments to be continuously monitored for significance, stopped early, and safeguards against statistical malpractices in data collection. A particular feature of our results is their simplicity. Our test statistics and confidence sequences all emit closed-form expressions, which are functions of statistics directly available from a standard linear regression table. We illustrate our methodology with the sequential analysis of software A/B experiments at Netflix, performing regression adjustment with pre-treatment outcomes.
Problem

Research questions and friction points this paper is trying to address.

- Develop anytime-valid inference for linear models in sequential settings
- Provide sequential tests and confidence sets with time-uniform guarantees
- Enable regression-adjusted causal inference in randomized experiments
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Anytime-valid linear models for sequential settings
- Sequential analogues of classical tests and confidence sets
- Heteroskedasticity-robust asymptotic sequential tests
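A confidence sequence can be obtained from the same mixture device used for testing. The sketch below is a hypothetical illustration rather than the paper's formula: it computes the classical Robbins-style normal-mixture boundary (the helper `cs_radius` and the choices `rho=10.0`, `alpha=0.05` are assumptions) for a unit-variance mean, giving intervals that are valid simultaneously at every sample size.

```python
import numpy as np

def cs_radius(v, rho, alpha):
    """Two-sided normal-mixture boundary on the sum scale: the interval
    [center - radius, center + radius] covers the true mean for ALL n
    simultaneously with probability at least 1 - alpha."""
    return np.sqrt((v + rho) * np.log((v + rho) / (rho * alpha ** 2)))

rng = np.random.default_rng(1)
theta = 0.3                                    # true effect (simulation only)
x = rng.normal(loc=theta, scale=1.0, size=2000)

t = np.arange(1, x.size + 1, dtype=float)
center = np.cumsum(x) / t                      # running sample mean
radius = cs_radius(t, rho=10.0, alpha=0.05) / t

# Holds jointly over every n with probability >= 0.95.
covered = bool(np.all((center - radius <= theta) & (theta <= center + radius)))
```

The radius shrinks roughly like sqrt(log n / n), slightly wider than a fixed-n confidence interval; that gap is the price of time-uniform coverage, which is what permits dynamic interval updating during a running experiment.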
Michael Lindon
Netflix
Statistics, Bayesian
Dae Woong Ham
Ross School of Business, Ann Arbor, MI
Martin Tingley
Microsoft, Redmond, WA
Iavor Bojinov
Harvard Business School, Boston, MA