MindGuard: Tracking, Detecting, and Attributing MCP Tool Poisoning Attack via Decision Dependence Graph

📅 2025-08-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Tool metadata poisoning attacks (TPAs) in Model Context Protocol (MCP) manipulate LLM agents by corrupting tool descriptions—inducing unauthorized actions without actual tool invocation, thereby evading conventional behavior-level defenses. Method: We propose a decision-level defense paradigm that first links LLM attention mechanisms to decision logic, constructing a Decision Dependence Graph (DDG) for pre-invocation provenance tracing and attack source attribution. Our approach integrates attention-driven decision tracking, DDG modeling, graph-structural anomaly detection, and secure policy transfer from Program Dependence Graphs (PDGs). Contribution/Results: Evaluated on real-world datasets, our method achieves 94–99% average detection accuracy and 95–100% attribution accuracy, with inference latency under 1 second and zero additional token overhead. This is the first work to leverage attention dynamics for decision-level TPA detection and root-cause attribution in MCP-based agent systems.

📝 Abstract
The Model Context Protocol (MCP) is increasingly adopted to standardize the interaction between LLM agents and external tools. However, this trend introduces a new threat: Tool Poisoning Attacks (TPA), where tool metadata is poisoned to induce the agent to perform unauthorized operations. Existing defenses that primarily focus on behavior-level analysis are fundamentally ineffective against TPA, as poisoned tools need not be executed, leaving no behavioral trace to monitor. Thus, we propose MindGuard, a decision-level guardrail for LLM agents, providing provenance tracking of call decisions, policy-agnostic detection, and poisoning source attribution against TPA. While fully explaining LLM decisions remains challenging, our empirical findings uncover a strong correlation between LLM attention mechanisms and tool invocation decisions. Therefore, we choose attention as an empirical signal for decision tracking and formalize this as the Decision Dependence Graph (DDG), which models the LLM's reasoning process as a weighted, directed graph where vertices represent logical concepts and edges quantify the attention-based dependencies. We further design robust DDG construction and graph-based anomaly analysis mechanisms that efficiently detect and attribute TPAs. Extensive experiments on real-world datasets demonstrate that MindGuard achieves 94%–99% average precision in detecting poisoned invocations and 95%–100% attribution accuracy, with processing times under one second and no additional token cost. Moreover, DDG can be viewed as an adaptation of the classical Program Dependence Graph (PDG), providing a solid foundation for applying traditional security policies at the decision level.
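The abstract describes the DDG as a weighted, directed graph whose vertices are logical concepts and whose edge weights quantify attention-based dependencies, with anomaly analysis used for detection and attribution. The sketch below illustrates that idea in miniature; it is not the paper's implementation, and all names, node labels, attention weights, and the threshold are hypothetical stand-ins.

```python
from collections import defaultdict

class DecisionDependenceGraph:
    """Illustrative DDG: vertices are logical concepts (user intent, tool
    descriptions, call decisions); edge weights are attention-derived
    dependency strengths between them."""
    def __init__(self):
        self.edges = defaultdict(dict)  # src concept -> {dst concept: weight}

    def add_dependency(self, src, dst, weight):
        self.edges[src][dst] = weight

def attribute_poisoning(ddg, decision, candidates, threshold=0.5):
    """Toy anomaly check: rank candidate sources whose dependency weight on
    the invocation decision exceeds a threshold (threshold is illustrative)."""
    hits = [(c, ddg.edges.get(c, {}).get(decision, 0.0)) for c in candidates]
    return sorted([(c, w) for c, w in hits if w > threshold],
                  key=lambda cw: -cw[1])

# Fabricated attention weights: a poisoned tool description dominates the
# decision to call send_file, far beyond the user's actual query.
ddg = DecisionDependenceGraph()
ddg.add_dependency("user_query", "call:send_file", 0.15)
ddg.add_dependency("desc:weather_tool", "call:send_file", 0.10)
ddg.add_dependency("desc:poisoned_tool", "call:send_file", 0.82)

flagged = attribute_poisoning(
    ddg, "call:send_file",
    ["user_query", "desc:weather_tool", "desc:poisoned_tool"])
print(flagged)  # the poisoned description is flagged as the attack source
```

Because the check runs on the decision graph before any tool executes, it matches the pre-invocation, behavior-free setting the abstract emphasizes.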
Problem

Research questions and friction points this paper is trying to address.

Detecting tool poisoning attacks in LLM agent interactions
Tracking decision provenance without behavioral traces
Attributing poisoning sources via attention-based dependency graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decision Dependence Graph models LLM reasoning
Attention-based dependencies track tool decisions
Graph anomaly analysis detects poisoning attacks
Zhiqiang Wang
University of Science and Technology of China
Junyang Zhang
California Institute of Technology, Stanford University, University of California, Irvine
machine learning and ML systems, robotics, digital design, semiconductors, integrated circuits
Guanquan Shi
Beihang University
HaoRan Cheng
University of Science and Technology of China
Yunhao Yao
University of Science and Technology of China
Kaiwen Guo
Synthesia
computer vision, machine learning, computer graphics
Haohua Du
Beihang University
Xiang-Yang Li
University of Science and Technology of China