🤖 AI Summary
Accurately estimating the causal effects of long-term product interventions on digital platforms, such as UI redesigns or recommendation algorithm updates, remains challenging because conventional short-term A/B tests fail to capture delayed and evolving impacts. To address this, we propose the first causal inference framework specifically designed for estimating long-term treatment effects. Our approach disentangles time-varying confounding from lagged treatment effects by explicitly modeling treatment duration as a key covariate. It integrates structural time-series modeling, doubly robust estimation, and dynamic causal graphs to enable counterfactual effect estimation without requiring costly long-duration experiments. Evaluated on real-world platform data, our method reduces long-term effect estimation error by 42% and achieves high-fidelity predictions across core metrics, including user retention rate and click-through rate, thereby improving both the reliability and efficiency of long-horizon strategy evaluation.
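To make the term "doubly robust estimation" concrete, here is a minimal sketch of a standard augmented inverse-propensity-weighted (AIPW) estimator on synthetic data. It is a generic, single-period textbook estimator and not the framework summarized above: it has no treatment-duration covariate, no structural time-series component, and no dynamic causal graph. The data-generating process, the true effect of 2.0, and the function names `fit_logistic` and `aipw_ate` are all assumptions made for illustration.

```python
import numpy as np


def fit_logistic(X, t, n_iter=25):
    """Fit a logistic regression by Newton's method; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p)
        hess = X.T @ (X * w[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def aipw_ate(x, t, y):
    """Doubly robust (AIPW) estimate of the average treatment effect."""
    X = np.column_stack([np.ones(len(x)), x])

    # Propensity model: estimated P(T = 1 | x), clipped to avoid extreme weights.
    e = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    e = np.clip(e, 0.01, 0.99)

    # Outcome models: one least-squares regression per treatment arm.
    b1 = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)[0]
    mu1, mu0 = X @ b1, X @ b0

    # Regression prediction plus inverse-propensity-weighted residual correction.
    psi = mu1 - mu0 + t * (y - mu1) / e - (1.0 - t) * (y - mu0) / (1.0 - e)
    return psi.mean()


# Synthetic observational data (assumed): x confounds both treatment and outcome.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
t = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-0.8 * x))).astype(float)
y = 2.0 * t + 1.5 * x + rng.normal(size=n)  # true effect is 2.0 by construction

naive = y[t == 1].mean() - y[t == 0].mean()  # biased by confounding through x
ate_hat = aipw_ate(x, t, y)
print(f"naive difference in means: {naive:.3f}")
print(f"AIPW estimate:             {ate_hat:.3f}")
```

The estimator is called doubly robust because it remains consistent if either the propensity model or the outcome model is correctly specified. In this toy setup both are, so the sketch only illustrates the mechanics of the correction term.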
📝 Abstract
Randomized controlled trials (RCTs), also known as A/B tests, have become the gold standard for evaluating the effectiveness of product changes on digital platforms. Yet accurately estimating the effects of long-term treatments remains a challenge. Product updates such as new user interfaces or recommendation algorithms are intended to persist in the system for an extended period. However, A/B tests are typically run for short durations, often less than two weeks, to facilitate rapid product iteration. Conducting lengthy experiments to capture the long-term impact of product changes is impractical because of potential negative impacts on user experience, high opportunity costs associated with user traffic, and delays in decision-making.