🤖 AI Summary
This work addresses the challenge of counterfactual policy evaluation on short-video platforms, where closed-loop human-AI feedback makes it difficult to simulate long-horizon and distributional outcomes accurately. To tackle this, the authors propose a modular digital twin system composed of four core modules (user, content, interaction, and platform), integrated with an event-driven execution layer and plug-and-play policy components to enable reproducible, high-fidelity simulation. A key contribution is a unified optimizer that routes selected tasks, such as persona generation, content annotation, and trend forecasting, to large language models acting as constrained decision services. This design preserves realistic closed-loop dynamics while enabling efficient, scalable evaluation of AI-driven policies under authentic feedback mechanisms and operational constraints.
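The "unified optimizer routing" idea can be made concrete with a small sketch. The snippet below is illustrative only and not the paper's API: every name (`route`, `SPECS`, `rule_based`, the task names) is an assumption. It shows the core pattern the summary describes: a single entry point that optionally calls an LLM, validates the LLM's output against a task-specific schema (here, a required-key set), and falls back to a deterministic rule when the output is malformed, so the simulation loop never stalls on a bad generation.

```python
import json

# Hypothetical task specs: each task lists the keys its output must contain.
# A real system would use a full JSON Schema; key sets keep the sketch short.
SPECS = {
    "persona": {"age_group", "interests"},
    "caption": {"text"},
}

def rule_based(task, ctx):
    # Deterministic fallbacks keep the closed loop running without an LLM.
    return {
        "persona": {"age_group": "18-24", "interests": ["music"]},
        "caption": {"text": ctx.get("title", "")},
    }[task]

def route(task, ctx, llm=None, use_llm=False):
    """Unified entry point: optionally call the LLM backend, accept its
    JSON only if it satisfies the task spec, otherwise fall back."""
    if use_llm and llm is not None:
        try:
            out = json.loads(llm(task, ctx))
            if SPECS[task] <= out.keys():  # schema check: required keys present
                return out
        except (json.JSONDecodeError, KeyError):
            pass  # malformed output: fall through to the rule-based path
    return rule_based(task, ctx)

# A stub "LLM" that always returns caption-shaped output: valid for the
# "caption" task, rejected (and replaced by the fallback) for "persona".
stub = lambda task, ctx: '{"text": "hi"}'
```

The point of the pattern is that LLM adoption is selective and safe: flipping `use_llm` per task changes which decisions the LLM makes without changing the surrounding simulation code.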
📝 Abstract
Short-video platforms are closed-loop, human-in-the-loop ecosystems where platform policy, creator incentives, and user behavior co-evolve. This feedback structure makes counterfactual policy evaluation difficult in production, especially for long-horizon and distributional outcomes. The challenge is amplified as platforms deploy AI tools that change what content enters the system, how agents adapt, and how the platform operates. We propose a large language model (LLM)-augmented digital twin for short-video platforms, with a modular four-twin architecture (User, Content, Interaction, Platform) and an event-driven execution layer that supports reproducible experimentation. Platform policies are implemented as pluggable components within the Platform Twin, and LLMs are integrated as optional, schema-constrained decision services (e.g., persona generation, content captioning, campaign planning, trend prediction) that are routed through a unified optimizer. This design supports scalable simulations that preserve closed-loop dynamics while allowing selective LLM adoption, enabling the study of platform policies, including AI-enabled policies, under realistic feedback and constraints.