🤖 AI Summary
Short-term A/B tests in online experimentation are vulnerable to time-varying nonstationarity, limiting their ability to accurately estimate long-term system effects; conversely, long-term experiments suffer from slow iteration cycles and poor scalability in large action spaces. To address this, we propose a sequential Bayesian optimization framework that integrates fast and slow online experiments with offline proxy evaluation. Specifically, short-cycle, biased online experiments, combined with off-policy evaluation (OPE), provide rapid but noisy and biased feedback, while long-cycle, unbiased experiments calibrate long-term impact. A unified Bayesian model jointly assimilates asynchronous, heterogeneous observations from these sources, enabling adaptive exploration-exploitation trade-offs. Our approach preserves accurate long-term effect estimation while substantially reducing optimization latency, thereby improving both decision-making efficiency and robustness in large-scale internet systems.
📝 Abstract
Online experiments in internet systems, also known as A/B tests, are used for a wide range of system tuning problems, such as optimizing recommender system ranking policies and learning adaptive streaming controllers. Decision-makers generally wish to optimize for the long-term treatment effects of system changes, which often requires running experiments for a long time, since short-term measurements can be misleading due to non-stationarity in treatment effects over time. Sequential experimentation strategies, which typically involve several iterations, can be prohibitively long in such cases. We describe a novel approach that combines fast experiments (e.g., biased experiments run only for a few hours or days) and/or offline proxies (e.g., off-policy evaluation) with long-running, slow experiments to perform sequential, Bayesian optimization over large action spaces in a short amount of time.
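The core idea, jointly modeling cheap biased observations and scarce unbiased ones, can be illustrated with a minimal toy sketch. This is not the paper's actual model: the constant-bias assumption, the polynomial basis, and all names (`f`, `predict_long_term`, etc.) are illustrative choices. Fast-experiment measurements are treated as the true long-term effect plus a learned offset, while slow-experiment measurements anchor the unbiased level; a single MAP fit (ridge regression) recovers both.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: the long-term effect f(x) of a tuning parameter x.
def f(x):
    return np.sin(3 * x)

# Fast experiments: plentiful and cheap, but biased (here, a constant +0.5) and noisy.
x_fast = rng.uniform(0, 1, 40)
y_fast = f(x_fast) + 0.5 + rng.normal(0, 0.2, x_fast.size)

# Slow experiments: unbiased but scarce and expensive.
x_slow = rng.uniform(0, 1, 5)
y_slow = f(x_slow) + rng.normal(0, 0.05, x_slow.size)

def features(x):
    # Polynomial basis for the shared latent effect function.
    return np.vstack([x**k for k in range(6)]).T

# Joint design matrix: shared basis columns, plus an indicator column that is 1
# only for fast observations, letting the model learn their bias from the data.
Phi = np.vstack([features(x_fast), features(x_slow)])
ind = np.concatenate([np.ones(x_fast.size), np.zeros(x_slow.size)])[:, None]
X = np.hstack([Phi, ind])
y = np.concatenate([y_fast, y_slow])

# MAP estimate under a Gaussian prior on the weights (ridge regression).
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

learned_bias = w[-1]          # should land near the true +0.5 offset

def predict_long_term(x):
    # Debiased prediction of the long-term effect at new parameter values.
    return features(np.atleast_1d(x)) @ w[:-1]
```

The fast data pins down the *shape* of the effect surface while the few slow observations pin down its *level*; in the paper's setting this role split is what lets long-term calibration proceed without waiting for slow experiments at every candidate action. A full implementation would replace the fixed basis with a Gaussian process and drive the next experiment choice with an acquisition function.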