Don't believe everything you read: Understanding and Measuring MCP Behavior under Misleading Tool Descriptions

📅 2026-02-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses a novel security threat arising from inconsistencies between tool descriptions and their actual code implementations in the Model Context Protocol (MCP) ecosystem, which may mislead AI agents into performing unauthorized or high-risk operations. We present the first large-scale analysis of description-code consistency across 10,240 MCP servers, leveraging an automated static analysis framework to systematically evaluate the impact of such discrepancies on AI decision-making. Our findings reveal that approximately 13% of servers exhibit severe inconsistencies capable of enabling unauthorized financial transactions, covert state manipulation, and other critical risks. Furthermore, we uncover significant variations in the prevalence and nature of these inconsistencies across different tool categories and deployment platforms, establishing description-code divergence as a widespread and hazardous emerging attack surface in AI-integrated systems.

Technology Category

Application Category

📝 Abstract
The Model Context Protocol (MCP) enables large language models to invoke external tools through natural-language descriptions, forming the foundation of many AI agent applications. However, MCP does not enforce consistency between documented tool behavior and actual code execution, even though MCP Servers often run with broad system privileges. This gap introduces a largely unexplored security risk. We study how mismatches between externally presented tool descriptions and underlying implementations systematically shape the mental models and decision-making behavior of intelligent agents. Specifically, we present the first large-scale study of description-code inconsistency in the MCP ecosystem. We design an automated static analysis framework and apply it to 10,240 real-world MCP Servers across 36 categories. Our results show that while most servers are highly consistent, approximately 13% exhibit substantial mismatches that can enable undocumented privileged operations, hidden state mutations, or unauthorized financial actions. We further observe systematic differences across application categories, popularity levels, and MCP marketplaces. Our findings demonstrate that description-code inconsistency is a concrete and prevalent attack surface in MCP-based AI agents, and motivate the need for systematic auditing and stronger transparency guarantees in future agent ecosystems.
Problem

Research questions and friction points this paper is trying to address.

Model Context Protocol
tool description inconsistency
AI agent security
description-code mismatch
privileged operations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model Context Protocol
description-code inconsistency
static analysis
AI agent security
tool misuse
🔎 Similar Papers
No similar papers found.
Zhihao Li
Zhihao Li
The Hong Kong University of Science and Technology (Guangzhou)
AI for ScienceAI for PDEGraph Neural Networks
B
Boyang Ma
School of Computer Science and Technology, Shandong University, Qingdao, China
X
Xuelong Dai
School of Computer Science and Technology, Shandong University, Qingdao, China
M
Minghui Xu
School of Computer Science and Technology, Shandong University, Qingdao, China
Y
Yue Zhang
School of Computer Science and Technology, Shandong University, Qingdao, China
B
Biwei Yan
School of Computer Science and Technology, Shandong University, Qingdao, China
Kun Li
Kun Li
Institute of Information Engineering, Chinese Academy of Sciences, China