🤖 AI Summary
This study addresses a critical limitation of traditional information retrieval systems, which infer user intent from observable behavior, an inference that breaks down when AI agents are privately configured by human operators and true intent becomes unobservable. The authors formally define the "agent-user problem" and analyze a large-scale platform dataset comprising 370,000 posts, 47,000 AI agents, and 4,000 communities. Through log analysis, behavioral classification, click-model evaluation (AUC), and epidemic-style propagation modeling (R₀ estimation), they show that although autonomous and operator-directed agent actions are indistinguishable at the individual level, platform-level signals can still stratify agents into meaningful quality tiers. Low-quality agents degrade click-model performance by 8.5% AUC, and cross-community capability references spread with strong transmissibility (R₀ = 1.26–3.53) that remains highly resistant to suppression. The findings establish intent unobservability as a structural property of operator-configured agents rather than a technical shortcoming that better detection could resolve.
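To make the click-model claim concrete, here is a minimal synthetic sketch, not the paper's model or data: a logistic-regression click model is trained on interaction logs in which a growing share of clicks comes from low-quality agents that ignore relevance, and its AUC is measured on a clean held-out set. The features, click probabilities, and contamination fractions below are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def make_logs(n, noisy_fraction, rng):
    """Synthetic interaction logs with two features: relevance and rank position."""
    relevance = rng.uniform(size=n)
    position = rng.integers(1, 11, size=n).astype(float)
    # High-quality agents: click probability driven mostly by relevance.
    p_good = 1.0 / (1.0 + np.exp(-(4.0 * relevance - 0.3 * position)))
    # Low-quality agents: click on whatever sits near the top, ignoring relevance.
    p_bad = 1.0 / (1.0 + np.exp(-(1.5 - 0.6 * position)))
    noisy = rng.uniform(size=n) < noisy_fraction
    p_click = np.where(noisy, p_bad, p_good)
    clicks = (rng.uniform(size=n) < p_click).astype(int)
    return np.column_stack([relevance, position]), clicks

rng = np.random.default_rng(0)
X_test, y_test = make_logs(20_000, noisy_fraction=0.0, rng=rng)  # clean evaluation set

for frac in (0.0, 0.2, 0.4, 0.6):
    X_train, y_train = make_logs(50_000, noisy_fraction=frac, rng=rng)
    model = LogisticRegression().fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"noisy-agent share {frac:.0%}: held-out AUC = {auc:.3f}")
```

In this toy setup the held-out AUC falls as the noisy share grows, qualitatively mirroring (but not reproducing) the 8.5% degradation reported in the study.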
📝 Abstract
User models in information retrieval rest on a foundational assumption that observed behavior reveals intent. This assumption collapses when the user is an AI agent privately configured by a human operator. For any action an agent takes, a hidden instruction could have produced identical output, making intent non-identifiable at the individual level. This is not a detection problem awaiting better tools; it is a structural property of any system where humans configure agents behind closed doors. We investigate the agent-user problem through a large-scale corpus from an agent-native social platform: 370K posts from 47K agents across 4K communities. Our findings are threefold: (1) individual agent actions cannot be classified as autonomous or operator-directed from observables; (2) population-level platform signals still separate agents into meaningful quality tiers, but a click model trained on agent interactions degrades steadily (−8.5% AUC) as lower-quality agents enter training data; (3) cross-community capability references spread endemically ($R_0$ = 1.26–3.53) and resist suppression even under aggressive modeled intervention. For retrieval systems, the question is no longer whether agent users will arrive, but whether models built on human-intent assumptions will survive their presence.
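The endemic-spread finding can be pictured with a toy branching process in which each active cross-community capability reference spawns a Poisson-distributed number of new references per generation. The function below, its parameters (n_seeds, generations), and the Poisson offspring assumption are illustrative choices rather than the paper's propagation model; the sketch only shows why suppression must exceed 1 − 1/R₀ before spread dies out.

```python
import numpy as np

def simulate_spread(r0, suppression=0.0, n_seeds=100, generations=20, seed=0):
    """Branching-process sketch: each active reference spawns
    Poisson(r0 * (1 - suppression)) new references per generation, where
    `suppression` is the fraction of would-be transmissions removed."""
    rng = np.random.default_rng(seed)
    active, history = n_seeds, [n_seeds]
    for _ in range(generations):
        # Sum of `active` independent Poisson(lam) draws equals one Poisson(active * lam) draw.
        active = int(rng.poisson(active * r0 * (1.0 - suppression)))
        history.append(active)
    return history

if __name__ == "__main__":
    # Endemic range reported in the abstract: R0 between 1.26 and 3.53.
    for r0 in (1.26, 3.53):
        threshold = 1.0 - 1.0 / r0  # suppression needed to push effective R below 1
        for s in (0.0, 0.25, 0.5):
            final = simulate_spread(r0, suppression=s)[-1]
            print(f"R0={r0:.2f}  suppression={s:.0%}  "
                  f"active after 20 generations: {final:,}  (die-out needs >{threshold:.0%})")
```

Under these assumptions, $R_0 = 3.53$ would require removing more than roughly 72% of transmissions before the spread collapses, which is consistent with the abstract's claim that even aggressive modeled intervention fails to suppress it.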