AI Summary
This paper addresses the fair and efficient online estimation of means across multiple populations, aiming to balance estimation accuracy across groups during dynamic sampling and to prevent systematic under-sampling of minority or hard-to-estimate subpopulations. To handle unknown and non-stationary distributions, we propose Variance-UCB, the first active learning framework for multi-population adaptive sampling that incorporates variance-based upper confidence bounds. We develop a general theoretical analysis framework and derive, for the first time, a tight upper bound on the variance-norm regret, substantially improving upon existing methods. Our theoretical guarantees accommodate challenging settings, including population heterogeneity and distributional non-stationarity, and extend naturally to new objective functions and distribution families. Extensive experiments demonstrate the robust effectiveness of Variance-UCB in applications such as online experimentation and adaptive clinical trials.
Abstract
We study a fundamental learning problem over multiple groups with unknown data distributions, in which an analyst wants to learn the mean of each group. Moreover, we want the data to be collected in a relatively fair manner, so that the noise in each group's estimate remains reasonable. In particular, we focus on settings where data are collected dynamically, which is important in adaptive experimentation for online platforms and in adaptive clinical trials for healthcare. In our model, we employ an active learning framework to sequentially collect samples with bandit feedback, observing one sample per period from the chosen group. After observing a sample, the analyst updates their estimates of that group's mean and variance and chooses the next group accordingly. The analyst's objective is to dynamically collect samples so as to minimize the collective noise of the estimators, measured by the norm of the vector of variances of the mean estimators. We propose an algorithm, Variance-UCB, that sequentially selects groups according to an upper confidence bound on the variance estimate. We provide a general theoretical framework that yields efficient learning bounds for any underlying distribution whose variances can be estimated well. This framework gives upper bounds on regret that improve significantly upon all existing bounds, as well as a collection of new results for objectives and distributions beyond those previously studied.
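As a rough illustration of the selection rule described in the abstract (not the paper's exact algorithm), the following sketch maintains per-group mean and variance estimates and repeatedly samples the group whose mean estimator has the largest optimistic variance. The bonus constant `c` and its `sqrt(log T / n)` form are assumptions made for this sketch, not values taken from the paper.

```python
import numpy as np

def variance_ucb(sample_fns, T, c=2.0):
    """Sketch of a Variance-UCB-style sampler.

    sample_fns: list of callables, one per group; each call draws one sample.
    T: total sampling budget.
    Returns (estimated means, per-group sample counts).
    """
    K = len(sample_fns)
    # Warm-up: draw two samples per group so an empirical variance exists.
    data = [[sample_fns[k](), sample_fns[k]()] for k in range(K)]
    for _ in range(T - 2 * K):
        scores = []
        for k in range(K):
            n = len(data[k])
            var_hat = np.var(data[k], ddof=1)      # empirical group variance
            bonus = c * np.sqrt(np.log(T) / n)     # optimism term (assumed form)
            scores.append((var_hat + bonus) / n)   # UCB on Var(mean estimator)
        k_star = int(np.argmax(scores))            # noisiest estimator wins
        data[k_star].append(sample_fns[k_star]())
    means = [float(np.mean(d)) for d in data]
    counts = [len(d) for d in data]
    return means, counts

rng = np.random.default_rng(0)
# Group 0 is noisier (sd 2.0) than group 1 (sd 0.5), so it should receive
# more of the budget, equalizing the variances of the two mean estimators.
fns = [lambda: rng.normal(0.0, 2.0), lambda: rng.normal(1.0, 0.5)]
means, counts = variance_ucb(fns, T=300)
```

In this toy run the high-variance group absorbs most of the 300 draws, which is the qualitative behavior the objective (minimizing a norm of the estimator variances) calls for.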