🤖 AI Summary
This study addresses a limitation of using large language models (LLMs) to simulate interventional experiments. Because LLMs are trained largely on observational data, applying an intervention to simulated users can unintentionally shift their latent attributes, so the implicit simulated population differs across treatment conditions. The authors call this “user drift” and formalize the confounding or selection bias it introduces, showing that it can inflate or attenuate estimated effects. To diagnose drift, they propose negative control outcomes: attributes that should remain invariant under intervention, so a shift in their distribution across conditions is evidence of drift. To mitigate it, they adjust the persona specification by eliciting additional confounders. In survey-style and multi-turn agent evaluations, targeted, setting-relevant confounders substantially reduce bias.
📝 Abstract
Large language models (LLMs) show potential as simulators of human behavior, offering a scalable way to study responses to interventions. However, because LLMs are trained largely on observational data, interventions in experiments with LLM-simulated synthetic users can induce unintended shifts in latent user attributes. The result is user drift, in which the implicit simulated population differs across treatment conditions, potentially distorting effect estimates. We formalize the confounding or selection bias that can arise due to user drift and show how intervention-dependent shifts can inflate or attenuate observed differences in user responses under intervention. To diagnose confounding, we propose using negative control outcomes (attributes that should remain invariant under intervention) to identify distribution shifts across intervention conditions, providing evidence of user drift. To mitigate drift, we study adjusting the persona specification by eliciting additional confounders, finding that targeted, setting-relevant confounders can substantially reduce bias across survey-style and multi-turn agent evaluations.
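As an illustration of the negative control idea, the sketch below compares the distribution of one attribute that the intervention should not affect across the two arms of a simulated experiment. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names, the age-bracket attribute, the use of total variation distance, and the permutation test are all choices made here for illustration, and the data are invented.

```python
import random
from collections import Counter


def total_variation(a, b):
    """Total variation distance between the empirical distributions of two samples."""
    ca, cb = Counter(a), Counter(b)
    support = set(ca) | set(cb)
    return 0.5 * sum(abs(ca[k] / len(a) - cb[k] / len(b)) for k in support)


def negative_control_test(control, treated, n_perm=2000, seed=0):
    """Permutation test for a shift in a negative control attribute.

    Returns the observed distance between arms and a p-value for the null
    hypothesis that both arms are drawn from the same population.
    """
    rng = random.Random(seed)
    observed = total_variation(control, treated)
    pooled = list(control) + list(treated)
    n_control = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign arm labels at random
        shuffled = total_variation(pooled[:n_control], pooled[n_control:])
        if shuffled >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical data: age bracket elicited from simulated users in each arm.
# The intervention should not change a user's age, so a large shift
# suggests the simulated population itself differs between arms.
control_ages = ["18-29"] * 50 + ["30-49"] * 50
treated_ages = ["18-29"] * 80 + ["30-49"] * 20

distance, p_value = negative_control_test(control_ages, treated_ages)
print(f"distance={distance:.3f}, p={p_value:.4f}")
```

A small p-value here would count as evidence of user drift rather than as a treatment effect, since the attribute cannot plausibly respond to the intervention. A large p-value does not rule drift out, because drift may occur in attributes that were not checked.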