🤖 AI Summary
Current research on the security of large language model (LLM) agents disproportionately emphasizes algorithmic defenses and overlooks the role of agent-human interaction in security decisions. Existing human participation mechanisms, meanwhile, force a trade-off between security guarantees and user cognitive load. Through a systematic analysis of 59 academic papers, 21 production agent systems, and 26 security plugins, this work reveals a pronounced disconnect between academia and industry: human-centric mechanisms (policy specification, runtime approval, and scope configuration) dominate real-world practice yet remain underexplored in research. To bridge this gap, the work reframes LLM agent security as an agent-human interaction (AHI) problem, proposes a three-direction research agenda, and calls for AHI security to be treated as a first-class research direction with its own design principles, evaluation methods, and theoretical foundations.
📝 Abstract
We argue that LLM agent security is fundamentally an agent-human interaction (AHI) problem, not a purely algorithmic one. To substantiate this position, we conduct a systematic analysis of 59 academic papers, 21 production agent systems, and 26 security plugins as of April 2026. Our analysis reveals a striking pattern: three human-centric security mechanisms (policy specification, runtime approval, and scope configuration) dominate industry practice, adopted by 14, 15, and 16 of the 21 systems, respectively, while the categories most heavily studied in academia (intent anchoring and trust labeling) see zero production deployment. Yet current human participation mechanisms are far from satisfactory: they suffer from a fundamental trade-off between cognitive burden and security guarantees, leaving users caught between approval fatigue and uncontrolled agent autonomy. We make three contributions. First, through a systematic comparison of LLM-based and human-based intent alignment, we argue that human participation in agent security decisions is indispensable given current capabilities. Second, we quantify a pronounced industry-academia mismatch: the security mechanisms that practitioners actually deploy receive scant research attention, while the approaches that researchers favor remain undeployed. Third, we propose a three-direction research agenda and call for AHI security to be recognized as a first-class research area, one that demands its own design principles, evaluation methods, and theoretical foundations.