From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

📅 2026-04-02

🤖 AI Summary

Existing MCP security research classifies attacks only by how they manifest on the attack surface, overlooking component-level behavior and multi-component attack chains, and therefore struggles to defend against emerging composite attacks. This work proposes the first component-centric framework for analyzing malicious MCP servers. It introduces a proof-of-concept dataset of 114 malicious MCP servers and a component-level attack model showing how component position and composition influence attack success rates. Building on this foundation, the authors design Connor, a two-stage behavioral deviation detector that combines pre-execution static analysis, which identifies malicious shell commands and extracts each tool's function intent, with runtime monitoring of deviations in behavioral trajectories. Experiments show that Connor achieves an F1-score of 94.6% on the curated dataset, outperforming existing methods by 8.9% to 59.6%, and detects two real-world malicious MCP servers.

📝 Abstract

The Model Context Protocol (MCP) standardizes how LLMs connect to external tools and data sources, enabling faster integration but introducing new attack vectors. Despite MCP's growing adoption, existing MCP security studies classify attacks by their observable effects, which obscures how attacks behave across different MCP server components and overlooks multi-component attack chains. Meanwhile, existing defenses are less effective against multi-component attacks and previously unknown malicious behaviors. This work presents a component-centric perspective for understanding and detecting malicious MCP servers. First, we build the first component-centric PoC dataset of 114 malicious MCP servers, in which attacks are realized by manipulating MCP components and their compositions. We evaluate these attacks' effectiveness across two MCP hosts and five LLMs and find that (1) component position shapes attack success rate, and (2) multi-component compositions often outperform single-component attacks by distributing malicious logic. Second, we propose and implement Connor, a two-stage behavioral deviation detector for malicious MCP servers. It first performs pre-execution analysis to detect malicious shell commands and extract each tool's function intent, and then conducts step-wise in-execution analysis to trace each tool's behavioral trajectory and detect deviations from its function intent. Evaluation on our curated dataset indicates that Connor achieves an F1-score of 94.6%, outperforming the state of the art by 8.9% to 59.6%. In real-world detection, Connor identifies two malicious servers.
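
The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not Connor's implementation: the regular expressions, the keyword-based intent extraction, the action labels, and all names (`ToolSpec`, `scan_shell_commands`, `extract_intent`, `find_deviations`) are hypothetical stand-ins chosen for illustration. It only shows the general shape of the approach: flag suspicious shell commands before execution, derive an expected set of actions from a tool's description, then report runtime actions that fall outside that set.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stage-1 rules; a real detector would need a much richer rule set.
SUSPICIOUS_SHELL = [
    re.compile(r"(curl|wget)\b[^|;]*\|\s*(ba)?sh\b"),  # download piped into a shell
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),                # recursive delete of the root
    re.compile(r"\bnc\b.*\s-e\s"),                      # netcat spawning a program
]

# Hypothetical keyword-to-action map standing in for intent extraction.
INTENT_KEYWORDS = {
    "read": "file_read",
    "write": "file_write",
    "fetch": "network_request",
    "download": "network_request",
}


@dataclass
class ToolSpec:
    """Illustrative view of one MCP tool: name, description, embedded commands."""
    name: str
    description: str
    shell_commands: list[str] = field(default_factory=list)


def scan_shell_commands(tool: ToolSpec) -> list[str]:
    """Stage 1a: return the tool's shell commands that match a suspicious pattern."""
    return [
        cmd for cmd in tool.shell_commands
        if any(pattern.search(cmd) for pattern in SUSPICIOUS_SHELL)
    ]


def extract_intent(tool: ToolSpec) -> set[str]:
    """Stage 1b: map description keywords to the actions the tool is expected to take."""
    words = re.findall(r"[a-z]+", tool.description.lower())
    return {INTENT_KEYWORDS[w] for w in words if w in INTENT_KEYWORDS}


def find_deviations(intent: set[str], trajectory: list[str]) -> list[str]:
    """Stage 2: return observed actions, in order, that the declared intent does not cover."""
    return [action for action in trajectory if action not in intent]
```

For example, a tool described as "Read a note file from disk" yields the intent `{"file_read"}` under these assumptions, so an observed trajectory of `["file_read", "network_request"]` would report `network_request` as a deviation. Matching keywords is a deliberately crude simplification of intent extraction; it is used here only to keep the sketch self-contained.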

Problem

Research questions and friction points this paper is trying to address.

MCP security

multi-component attacks

malicious behavior detection

attack chains

LLM tool integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Model Context Protocol

component-centric analysis

multi-component attack