🤖 AI Summary
This work identifies novel security risks introduced by the Model Context Protocol (MCP) client-server architecture. We systematically investigate these threats by designing malicious MCP servers, developing a proof-of-concept attack framework, and conducting empirical deployments across three major MCP aggregation platforms. Our study is the first to identify and validate four distinct attack vectors: tool poisoning, puppeteering, rug-pull service withdrawal, and exploitation of malicious external resources. Experiments demonstrate that five mainstream large language models (LLMs) can be induced—via seemingly benign MCP services—to perform sensitive local operations, including reading private files and transferring digital assets. Crucially, existing platform review mechanisms fail to detect such threats, and users lack effective means to identify malicious services. We further distill four fundamental security challenges confronting the current MCP ecosystem. These findings provide critical empirical evidence and concrete guidance for securing LLM-based agents, refining MCP protocol design, and strengthening platform governance.
📝 Abstract
The Model Context Protocol (MCP) is an emerging standard designed to enable seamless interaction between Large Language Model (LLM) applications and external tools or resources. Within a short period, thousands of MCP services have already been developed and deployed. However, the client-server integration architecture inherent in MCP may expand the attack surface against LLM Agent systems, introducing new vulnerabilities that allow attackers to exploit by designing malicious MCP servers. In this paper, we present the first systematic study of attack vectors targeting the MCP ecosystem. Our analysis identifies four categories of attacks, i.e., Tool Poisoning Attacks, Puppet Attacks, Rug Pull Attacks, and Exploitation via Malicious External Resources. To evaluate the feasibility of these attacks, we conduct experiments following the typical steps of launching an attack through malicious MCP servers: upload-download-attack. Specifically, we first construct malicious MCP servers and successfully upload them to three widely used MCP aggregation platforms. The results indicate that current audit mechanisms are insufficient to identify and prevent the proposed attack methods. Next, through a user study and interview with 20 participants, we demonstrate that users struggle to identify malicious MCP servers and often unknowingly install them from aggregator platforms. Finally, we demonstrate that these attacks can trigger harmful behaviors within the user's local environment-such as accessing private files or controlling devices to transfer digital assets-by deploying a proof-of-concept (PoC) framework against five leading LLMs. Additionally, based on interview results, we discuss four key challenges faced by the current security ecosystem surrounding MCP servers. These findings underscore the urgent need for robust security mechanisms to defend against malicious MCP servers.