🤖 AI Summary
This paper identifies and formalizes a novel privacy threat—“intent reverse-engineering attacks”—in Large Language Model (LLM) agent architectures based on the Model Context Protocol (MCP). In such systems, a semi-honest third-party MCP server can accurately reconstruct users’ private latent intents by analyzing legitimate tool invocation logs, including tool purposes, invocation statements, and returned results. To mitigate this threat, we propose a Hierarchical Information Isolation and Three-Dimensional Semantic Analysis framework that enables precise intent reconstruction at the step level. Experimental results across multiple scenarios demonstrate semantic alignment exceeding 85%, significantly outperforming baseline methods. This work is the first to rigorously establish tool interaction logs as high-risk privacy leakage vectors. It provides critical security insights for trustworthy LLM agent design and introduces a practical, deployable defense paradigm grounded in semantic-aware log sanitization and intent obfuscation.
📝 Abstract
The rapid evolution of Large Language Models (LLMs) into autonomous agents has led to the adoption of the Model Context Protocol (MCP) as a standard for discovering and invoking external tools. While this architecture decouples the reasoning engine from tool execution to enhance scalability, it introduces a significant privacy surface: third-party MCP servers, acting as semi-honest intermediaries, can observe detailed tool interaction logs outside the user's trusted boundary. In this paper, we first identify and formalize a novel privacy threat termed Intent Inversion, where a semi-honest MCP server attempts to reconstruct the user's private underlying intent solely by analyzing legitimate tool calls. To systematically assess this vulnerability, we propose IntentMiner, a framework that leverages Hierarchical Information Isolation and Three-Dimensional Semantic Analysis, integrating tool purpose, call statements, and returned results, to accurately infer user intent at the step level. Extensive experiments demonstrate that IntentMiner achieves a high degree of semantic alignment (over 85%) with original user queries, significantly outperforming baseline approaches. These results highlight the inherent privacy risks in decoupled agent architectures, revealing that seemingly benign tool execution logs can serve as a potent vector for exposing user secrets.