SoK: Trust-Authorization Mismatch in LLM Agent Interactions

📅 2025-12-07

📈 Citations: 0

✨ Influential: 0

career value

207K/year

🤖 AI Summary

Large language models (LLMs) are evolving into autonomous agents capable of interacting with external environments; however, their natural-language-driven, probabilistic decision-making undermines traditional deterministic-logic-based security mechanisms, creating a fundamental mismatch between trust assessment and authorization policies—and thereby introducing novel security risks. Method: This paper proposes the first risk analysis model centered on the “trust–authorization gap,” formally modeling permission control for LLM agents in open-ended interactions, systematically classifying existing attacks and defenses, and identifying critical research gaps. Contributions: (1) A unified analytical framework for agent interaction security that integrates fragmented academic insights; (2) Foundational theoretical principles for dynamic authorization and trustworthy agency; and (3) A methodology enabling principled implementation of the least-privilege principle under AI behavioral uncertainty.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) are rapidly evolving into autonomous agents capable of interacting with the external world, significantly expanding their capabilities through standardized interaction protocols. However, this paradigm revives the classic cybersecurity challenges of agency and authorization in a novel and volatile context. As decision-making shifts from deterministic code logic to probabilistic inference driven by natural language, traditional security mechanisms designed for deterministic behavior fail. It is fundamentally challenging to establish trust for unpredictable AI agents and to enforce the Principle of Least Privilege (PoLP) when instructions are ambiguous. Despite the escalating threat landscape, the academic community's understanding of this emerging domain remains fragmented, lacking a systematic framework to analyze its root causes. This paper provides a unifying formal lens for agent-interaction security. We observed that most security threats in this domain stem from a fundamental mismatch between trust evaluation and authorization policies. We introduce a novel risk analysis model centered on this trust-authorization gap. Using this model as a unifying lens, we survey and classify the implementation paths of existing, often seemingly isolated, attacks and defenses. This new framework not only unifies the field but also allows us to identify critical research gaps. Finally, we leverage our analysis to suggest a systematic research direction toward building robust, trusted agents and dynamic authorization mechanisms.

Problem

Research questions and friction points this paper is trying to address.

Analyzes trust-authorization mismatch in LLM agent interactions.

Addresses security challenges from probabilistic AI decision-making.

Unifies fragmented understanding of agent-interaction security threats.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Formal framework for agent-interaction security analysis

Risk model centered on trust-authorization mismatch

Unified classification of attacks and defenses

🔎 Similar Papers

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies