AI Summary
Autonomous AI agents pose significant security risks when executing complex tasks, since they can autonomously acquire information and execute code, yet existing approaches lack fundamental safeguards. This work proposes the first formal, dynamically adaptive security framework designed specifically for AI agents. Operating under a worst-case adversarial threat model, the framework enforces fine-grained security policies that dynamically constrain agent behavior. Enforcement is achieved through a user-space kernel combined with eBPF-based system call interception, ensuring strict compliance with predefined policies regardless of the agent's internal design. By bridging the gap between theoretical policy specification and runtime enforcement, the framework provides end-to-end security guarantees, closing the loop between policy intent and actual execution.
Abstract
Autonomous AI agents powered by Large Language Models can reason, plan, and execute complex tasks, but their ability to autonomously retrieve information and run code introduces significant security risks. Existing approaches attempt to regulate agent behavior through training or prompting, which does not offer fundamental security guarantees. We present ClawLess, a security framework that enforces formally verified policies on AI agents under a worst-case threat model where the agent itself may be adversarial. ClawLess formalizes a fine-grained security model over system entities, trust scopes, and permissions to express dynamic policies that adapt to agents' runtime behavior. These policies are translated into concrete security rules and enforced through a user-space kernel augmented with eBPF-based syscall interception. This approach bridges the formal security model with practical enforcement, ensuring security regardless of the agent's internal design.
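To make the abstract's security model concrete, the sketch below models entities, trust scopes, and permissions, with a policy that can be tightened at runtime. This is an illustrative assumption, not ClawLess's actual API: the names `Entity`, `Policy`, `grant`, `revoke`, and `check`, and the scope labels such as `"workspace"` and `"internet"`, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Permission(Flag):
    """Fine-grained permissions an agent action may require."""
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    NETWORK = auto()


@dataclass(frozen=True)
class Entity:
    """A system entity the agent acts on (file, host, process, ...)."""
    name: str
    trust_scope: str  # e.g. "workspace", "system", "internet" (hypothetical labels)


class Policy:
    """Maps trust scopes to allowed permissions; adaptable at runtime."""

    def __init__(self) -> None:
        self._rules: dict[str, Permission] = {}

    def grant(self, scope: str, perms: Permission) -> None:
        self._rules[scope] = self._rules.get(scope, Permission(0)) | perms

    def revoke(self, scope: str, perms: Permission) -> None:
        # Dynamic adaptation: tighten the policy in response to agent behavior.
        self._rules[scope] = self._rules.get(scope, Permission(0)) & ~perms

    def check(self, entity: Entity, perm: Permission) -> bool:
        # Deny by default: unknown scopes carry no permissions.
        return perm in self._rules.get(entity.trust_scope, Permission(0))


policy = Policy()
policy.grant("workspace", Permission.READ | Permission.WRITE)
policy.grant("internet", Permission.READ)

notes = Entity("/workspace/notes.txt", "workspace")
site = Entity("https://example.com", "internet")

print(policy.check(notes, Permission.WRITE))  # True: writes allowed in workspace
print(policy.check(site, Permission.WRITE))   # False: internet scope is read-only
# E.g. after the agent ingests untrusted content, revoke further network reads:
policy.revoke("internet", Permission.READ)
print(policy.check(site, Permission.READ))    # False after revocation
```

In the paper's design, checks like these would be compiled into concrete security rules and enforced in the user-space kernel and via eBPF syscall interception, so a misbehaving agent cannot bypass them from inside its own process.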