Adaptive Neyman Allocation

📅 2023-09-15

🏛️ ACM Conference on Economics and Computation

📈 Citations: 12

✨ Influential: 2

🤖 AI Summary

This paper addresses the optimal sample allocation problem in multi-stage randomized controlled trials when the variances of the treatment and control groups are unknown. We propose Adaptive Neyman Allocation: a method that estimates each group's variance from early-stage data and updates the assignment ratios of subsequent stages to minimize the variance of the treatment effect estimator. We introduce, for the first time, a competitive analysis framework into experimental design; theoretically, our algorithm achieves an Ω(√log M) competitive ratio over T total samples and M stages, approaching the information-theoretic lower bound. Our approach breaks from the conventional fixed 1:1 allocation paradigm by enabling variance-driven, stage-wise allocation. Empirical evaluation on real-world A/B tests at a major social platform demonstrates its effectiveness: under finite-stage constraints, estimation accuracy improves by over 30% compared to standard designs.

📝 Abstract

Why are field experiments usually conducted with half of the subjects treated and half in control? One answer, dating back to Neyman (1934), is that experimenters usually believe the treated and control groups to have the same level of variability. When the treated and control groups have different levels of variability, for example when an intervention in a social experiment triggers heterogeneity or even polarization of the outcomes, the seminal work of Neyman (1934) recommends an unequal allocation: the sizes of the treated and control groups should be proportional to their respective standard deviations. This approach has since become known as "Neyman allocation." Useful as it is, Neyman allocation is hard to apply when the standard deviations of the treated and control groups are unknown in advance. Fortunately, many applications are multi-stage in nature, which allows earlier-stage observations to be used to estimate the standard deviations. If the earlier-stage observations suggest a higher level of variability in one group, more experimental subjects will be randomly allocated to that group in the later stages, so that the confidence intervals of the average outcomes are roughly equal between the two groups. We refer to this approach as "adaptive Neyman allocation." In this paper, we study the optimal adaptive Neyman allocation problem. To study this problem, we introduce the competitive analysis framework into experimental designs. When a total of T experimental subjects are enrolled over M stages, our proposed algorithm is [EQUATION] competitive against a hindsight benchmark that knew the standard deviations in advance. This result nearly matches the information-theoretic limit of conducting experiments. Using online A/B testing data from a social media site, we demonstrate the effectiveness of our adaptive Neyman allocation algorithm, highlighting its practicality especially when applied with only a limited number of stages.
A full version of this paper can be found at https://arxiv.org/abs/2309.08808
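
The allocation rule described in the abstract can be illustrated with a short Python sketch. It is not the paper's algorithm: `neyman_allocation` applies the classical rule (group sizes proportional to standard deviations), and `two_stage_allocation` is a hypothetical two-stage simplification that runs an equal-split pilot, estimates the standard deviations, and then aims for the Neyman totals. All function names, the pilot size, and the clipping rule are assumptions made for illustration.

```python
import random
import statistics


def neyman_allocation(total, sd_treated, sd_control):
    """Split `total` subjects proportionally to the two standard deviations."""
    if sd_treated + sd_control == 0:
        n_treated = total // 2  # no variability information: fall back to 1:1
    else:
        n_treated = round(total * sd_treated / (sd_treated + sd_control))
    return n_treated, total - n_treated


def estimator_variance(n_treated, n_control, sd_treated, sd_control):
    """Variance of the difference-in-means estimator for given group sizes."""
    return sd_treated**2 / n_treated + sd_control**2 / n_control


def two_stage_allocation(total, pilot_per_group, draw_treated, draw_control):
    """Hypothetical two-stage sketch: pilot 1:1, then aim for Neyman totals.

    Assumes pilot_per_group >= 2 and total >= 2 * pilot_per_group.
    """
    pilot_t = [draw_treated() for _ in range(pilot_per_group)]
    pilot_c = [draw_control() for _ in range(pilot_per_group)]
    sd_t = statistics.stdev(pilot_t)  # sample standard deviations from stage 1
    sd_c = statistics.stdev(pilot_c)
    target_t, _ = neyman_allocation(total, sd_t, sd_c)
    # Final group sizes cannot fall below what the pilot already used.
    n_t = min(max(target_t, pilot_per_group), total - pilot_per_group)
    return n_t, total - n_t


if __name__ == "__main__":
    # Known standard deviations 3 and 1, 400 subjects in total.
    print(neyman_allocation(400, 3.0, 1.0))
    print(estimator_variance(300, 100, 3.0, 1.0))  # Neyman split
    print(estimator_variance(200, 200, 3.0, 1.0))  # equal split

    # Unknown standard deviations: estimate them from a 50-per-group pilot.
    random.seed(0)
    print(two_stage_allocation(
        400, 50,
        lambda: random.gauss(0.0, 3.0),
        lambda: random.gauss(0.0, 1.0),
    ))
```

With known standard deviations 3 and 1 and 400 subjects, the rule assigns 300 to the treated group and 100 to control, which lowers the estimator variance from 0.05 under an equal split to 0.04. The two-stage variant only approximates this, because its targets depend on noisy pilot estimates.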

Problem

Research questions and friction points this paper is trying to address.

Adaptive Neyman Allocation addresses unknown standard deviations in experiments

It proposes a multi-stage algorithm using earlier data for allocation

The method improves statistical power in A/B testing and trials

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-stage adaptive Neyman allocation algorithm

Competitive analysis framework for experiments

Theory for estimation and inference
