Securing AI Agents with Information-Flow Control

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
As AI agents gain increasing autonomy, security threats—particularly prompt injection—pose growing risks to agent integrity and confidentiality. Method: This paper introduces Fides, the first information-flow control (IFC) framework specifically designed for AI agent planners. It integrates dynamic taint tracking, confidentiality/integrity labels, and policy-enforcement mechanisms. Its core innovations include: (1) a formal IFC model tailored to planning processes; (2) security primitives enabling selective information hiding; and (3) a task taxonomy jointly optimizing security guarantees and functional utility. Contribution/Results: Implemented as an open-source secure planner, Fides significantly expands the set of tasks safely executable under strong, formal security guarantees—demonstrated via rigorous evaluation on the AgentDojo benchmark—while maintaining practical performance and usability.

📝 Abstract
As AI agents become increasingly autonomous and capable, ensuring their security against vulnerabilities such as prompt injection becomes critical. This paper explores the use of information-flow control (IFC) to provide security guarantees for AI agents. We present a formal model to reason about the security and expressiveness of agent planners. Using this model, we characterize the class of properties enforceable by dynamic taint-tracking and construct a taxonomy of tasks to evaluate security and utility trade-offs of planner designs. Informed by this exploration, we present Fides, a planner that tracks confidentiality and integrity labels, deterministically enforces security policies, and introduces novel primitives for selectively hiding information. Its evaluation in AgentDojo demonstrates that this approach broadens the range of tasks that can be securely accomplished. A tutorial to walk readers through the concepts introduced in the paper can be found at https://github.com/microsoft/fides
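The abstract's core mechanism, tracking confidentiality and integrity labels and deterministically checking them before actions execute, can be illustrated with a minimal sketch. This is not the paper's actual API: the names `Label`, `combine`, and `check_action`, and the two-point lattices are hypothetical simplifications of dynamic taint tracking for a planner.

```python
from dataclasses import dataclass
from enum import IntEnum

# Hypothetical two-point lattices; the paper's label model is richer.
class Integrity(IntEnum):
    UNTRUSTED = 0
    TRUSTED = 1

class Confidentiality(IntEnum):
    PUBLIC = 0
    SECRET = 1

@dataclass(frozen=True)
class Label:
    integrity: Integrity
    confidentiality: Confidentiality

    def join(self, other: "Label") -> "Label":
        # Data derived from two sources inherits the weaker integrity
        # and the stronger confidentiality of the two.
        return Label(
            Integrity(min(self.integrity, other.integrity)),
            Confidentiality(max(self.confidentiality, other.confidentiality)),
        )

@dataclass
class Value:
    data: str
    label: Label

def combine(a: Value, b: Value) -> Value:
    """Taint propagation: a value computed from a and b gets the label join."""
    return Value(a.data + b.data, a.label.join(b.label))

def check_action(arg: Value, requires_trusted: bool, public_sink: bool) -> bool:
    """Deterministic policy check before the planner executes a tool call."""
    if requires_trusted and arg.label.integrity is Integrity.UNTRUSTED:
        return False  # argument tainted by untrusted input: block
    if public_sink and arg.label.confidentiality is Confidentiality.SECRET:
        return False  # would leak secret data to a public sink: block
    return True
```

Under this sketch, a tool argument derived from attacker-controlled web content becomes `UNTRUSTED`, so a prompt-injection payload cannot drive a high-integrity action even if the LLM is fooled; the check is on labels, not on model judgment.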
Problem

Research questions and friction points this paper is trying to address.

Securing AI agents against vulnerabilities like prompt injection
Using information-flow control for AI agent security guarantees
Evaluating security and utility trade-offs in planner designs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses information-flow control for AI security
Introduces Fides planner with confidentiality and integrity labels
Evaluates security-utility trade-offs in AgentDojo
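The "selectively hiding information" primitive mentioned in the abstract can also be sketched: the planner swaps a sensitive value for an opaque reference, letting the LLM route the data without ever reading it. The `Vault`, `hide`, and `reveal` names below are hypothetical, not Fides's actual interface.

```python
class Vault:
    """Stores hidden values; the LLM only ever sees opaque references."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self._counter = 0

    def hide(self, value: str) -> str:
        # Replace the raw value with a placeholder the planner can pass around.
        self._counter += 1
        ref = f"<var{self._counter}>"
        self._store[ref] = value
        return ref

    def reveal(self, ref: str) -> str:
        # Only the trusted runtime dereferences placeholders, at the sink.
        return self._store[ref]
```

The design point is that hiding preserves utility: the agent can still complete tasks that move secret or untrusted data between tools, because the runtime substitutes real values only at policy-approved sinks.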
Manuel Costa
Microsoft Research
Operating Systems, Networks, Programming Languages, Security
Boris Köpf
Microsoft
Aashish Kolluri
Microsoft
Andrew Paverd
Microsoft
Security, Privacy
M. Russinovich
Microsoft
Ahmed Salem
Microsoft
Shruti Tople
Azure Research, Microsoft
Systems and Security
Lukas Wutschitz
Microsoft
Santiago Zanella-Béguelin
Microsoft