Securing AI Agents with Information-Flow Control

📅 2025-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: As AI agents gain autonomy, security threats, particularly prompt injection, pose growing risks to agent integrity and confidentiality. Method: This paper introduces Fides, described as the first information-flow control (IFC) framework designed specifically for AI agent planners. It combines dynamic taint tracking, confidentiality and integrity labels, and deterministic policy enforcement. Its core contributions are: (1) a formal IFC model tailored to agent planning; (2) security primitives for selectively hiding information; and (3) a task taxonomy for evaluating trade-offs between security guarantees and utility. Contribution/Results: Implemented as an open-source secure planner, Fides broadens the set of tasks that can be completed under formal security guarantees, as shown by its evaluation on the AgentDojo benchmark, while maintaining practical performance and usability.
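To make the idea of label tracking and policy enforcement concrete, here is a minimal sketch in Python. It is not Fides's code or API: the names `Label`, `Labeled`, `combine`, `check_tool_call`, and `PolicyViolation`, the two-level label lattices, and the specific policy are all assumptions chosen for illustration. It shows only the generic technique: every value carries a confidentiality and an integrity label, derived values take the most restrictive label of their inputs, and a deterministic check runs before a tool call.

```python
from dataclasses import dataclass
from enum import IntEnum


class Integrity(IntEnum):
    TRUSTED = 0      # e.g. came from the user or system prompt
    UNTRUSTED = 1    # e.g. came from a web page or an email body


class Confidentiality(IntEnum):
    PUBLIC = 0
    SECRET = 1


@dataclass(frozen=True)
class Label:
    integrity: Integrity
    confidentiality: Confidentiality

    def join(self, other: "Label") -> "Label":
        # Least upper bound: the result is as restrictive as its most
        # restrictive input on each axis.
        return Label(
            max(self.integrity, other.integrity),
            max(self.confidentiality, other.confidentiality),
        )


@dataclass(frozen=True)
class Labeled:
    value: str
    label: Label


def combine(a: Labeled, b: Labeled) -> Labeled:
    # Any value derived from two inputs inherits the join of their labels.
    return Labeled(a.value + b.value, a.label.join(b.label))


class PolicyViolation(Exception):
    pass


def check_tool_call(arg: Labeled, *, consequential: bool, public_sink: bool) -> None:
    # Hypothetical policy: consequential actions need trusted arguments,
    # and secret data must not reach a public sink.
    if consequential and arg.label.integrity is Integrity.UNTRUSTED:
        raise PolicyViolation("untrusted data cannot drive a consequential action")
    if public_sink and arg.label.confidentiality is Confidentiality.SECRET:
        raise PolicyViolation("secret data cannot flow to a public sink")
```

In this sketch, concatenating a trusted instruction with text taken from an untrusted email yields an untrusted value, so `check_tool_call(..., consequential=True, ...)` rejects it regardless of what the text says. The decision depends only on labels, which is what makes the enforcement deterministic.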

📝 Abstract

As AI agents become increasingly autonomous and capable, ensuring their security against vulnerabilities such as prompt injection becomes critical. This paper explores the use of information-flow control (IFC) to provide security guarantees for AI agents. We present a formal model to reason about the security and expressiveness of agent planners. Using this model, we characterize the class of properties enforceable by dynamic taint-tracking and construct a taxonomy of tasks to evaluate security and utility trade-offs of planner designs. Informed by this exploration, we present Fides, a planner that tracks confidentiality and integrity labels, deterministically enforces security policies, and introduces novel primitives for selectively hiding information. Its evaluation in AgentDojo demonstrates that this approach broadens the range of tasks that can be securely accomplished. A tutorial that walks readers through the concepts introduced in the paper can be found at https://github.com/microsoft/fides.
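The abstract does not say how the primitives for selectively hiding information work, so the following is only one generic way such hiding could look, not a description of Fides. The `VariableStore` class, its `hide` and `expand` methods, the `planner_view` function, and the `$var_N` handle format are all hypothetical. The idea sketched is that untrusted tool output is kept out of the planner's context and replaced by an opaque handle, which is expanded only when a tool actually needs the content.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VariableStore:
    # Maps opaque handles to hidden content (hypothetical design).
    _values: Dict[str, str] = field(default_factory=dict)

    def hide(self, content: str) -> str:
        # Store the content and hand back a handle that reveals nothing about it.
        handle = f"$var_{len(self._values)}"
        self._values[handle] = content
        return handle

    def expand(self, handle: str) -> str:
        # Resolve a handle outside the planner, e.g. when building tool arguments.
        return self._values[handle]


def planner_view(store: VariableStore, tool_output: str, trusted: bool) -> str:
    # Trusted output is shown to the planner as-is; untrusted output is
    # replaced by a handle so its text never enters the planner's context.
    return tool_output if trusted else store.hide(tool_output)
```

Under these assumptions, an email body containing injected instructions would reach the planner only as a handle such as `$var_0`. The planner can still pass that handle to another tool, but the injected text cannot influence which tools it chooses to call.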

Problem

Research questions and friction points this paper is trying to address.

Securing AI agents against vulnerabilities like prompt injection

Using information-flow control for AI agent security guarantees

Evaluating security and utility trade-offs in planner designs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses information-flow control to provide security guarantees for AI agents

Introduces the Fides planner, which tracks confidentiality and integrity labels

Evaluates security-utility trade-offs in AgentDojo

🔎 Similar Papers

The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies