🤖 AI Summary
This work addresses the vulnerability of large language model (LLM) agent systems to natural language attacks that exploit tool access and escalate privileges, particularly in multi-agent settings where novel threats such as “confused deputy” scenarios emerge. The paper presents SEAgent, the first systematic framework modeling such privilege escalation risks through a mandatory access control (MAC) approach that integrates attribute-based access control (ABAC) with information flow control. SEAgent monitors agent–tool interactions via an information flow graph and enforces customizable security policies grounded in entity attributes. Experimental evaluation demonstrates that SEAgent effectively mitigates both known and newly identified privilege abuse attacks with negligible performance overhead, low false-positive rates, and strong robustness, while remaining adaptable to real-world deployment environments.
📝 Abstract
Large Language Model (LLM)-based agent systems are increasingly deployed for complex real-world tasks but remain vulnerable to natural language-based attacks that exploit over-privileged tool use. This paper aims to understand and mitigate such attacks through the lens of privilege escalation, defined as agent actions exceeding the least privilege required for a user's intended task. Based on a formal model of LLM agent systems, we identify novel privilege escalation scenarios, particularly in multi-agent systems, including a variant akin to the classic confused deputy problem. To defend against both known and newly demonstrated privilege escalation attacks, we propose SEAgent, a mandatory access control (MAC) framework built upon attribute-based access control (ABAC). SEAgent monitors agent-tool interactions via an information flow graph and enforces customizable security policies based on entity attributes. Our evaluations show that SEAgent effectively blocks various privilege escalation attacks while maintaining a low false positive rate and negligible system overhead, demonstrating its robustness and adaptability in securing LLM-based agent systems.
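The abstract's core mechanism, ABAC-style policy checks over an information flow graph of agent-tool interactions, can be illustrated with a minimal sketch. This is not the paper's implementation; all names (`Entity`, `Policy`, `FlowMonitor`, the specific attributes) are hypothetical, and taint propagation is reduced to attribute inheritance along recorded flows.

```python
# Illustrative sketch of ABAC enforcement over an information flow graph.
# Assumption: an agent that has consumed untrusted data inherits that
# attribute, and policies can forbid tainted agents from privileged tools.
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    """An agent or tool, carrying a set of security-relevant attributes."""
    name: str
    attributes: frozenset

@dataclass(frozen=True)
class Policy:
    """Allow a call only if required attributes are present and no
    forbidden attribute has tainted the caller."""
    required: frozenset
    forbidden: frozenset

    def allows(self, attrs: frozenset) -> bool:
        return self.required <= attrs and not (self.forbidden & attrs)

class FlowMonitor:
    """Record information flows between entities; attributes (including
    previously inherited taint) propagate along each recorded edge."""
    def __init__(self):
        self.taint = {}  # entity name -> set of inherited attributes

    def record_flow(self, src: Entity, dst: Entity) -> None:
        inherited = set(src.attributes) | self.taint.get(src.name, set())
        self.taint.setdefault(dst.name, set()).update(inherited)

    def effective_attrs(self, e: Entity) -> frozenset:
        return frozenset(set(e.attributes) | self.taint.get(e.name, set()))

def check_call(monitor: FlowMonitor, policy: Policy, agent: Entity) -> bool:
    """Mediate a tool call: evaluate the policy against the agent's
    effective (taint-augmented) attributes."""
    return policy.allows(monitor.effective_attrs(agent))
```

A confused-deputy-style scenario then falls out naturally: a planner agent that reads untrusted web content becomes tainted, so a policy forbidding `untrusted_input` blocks its subsequent privileged tool call even though the planner's own attributes would have permitted it.

```python
web = Entity("web_reader", frozenset({"untrusted_input"}))
planner = Entity("planner", frozenset({"user_task"}))
policy = Policy(required=frozenset({"user_task"}),
                forbidden=frozenset({"untrusted_input"}))

m = FlowMonitor()
print(check_call(m, policy, planner))  # allowed before any tainted flow
m.record_flow(web, planner)
print(check_call(m, policy, planner))  # denied once tainted
```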