AI Summary
This paper identifies a novel security threat, the MCP Preference Manipulation Attack (MPMA), within the open ecosystem of the Model Context Protocol (MCP), wherein malicious third-party servers manipulate large language models (LLMs) into preferentially invoking their services for economic gain (e.g., paid-service revenue or ad impressions). We formally define MPMA for the first time and propose two attack variants: the Direct Preference Manipulation Attack (DPMA) and the Genetic-based Advertising Preference Manipulation Attack (GAPMA). GAPMA integrates tool-description injection, multi-strategy advertising initialization, and preference modeling to achieve high success rates (>92%) while significantly reducing detectability. Empirical evaluation demonstrates that MPMA severely undermines fairness and trustworthiness in the MCP ecosystem. Our work establishes a foundational problem formulation, provides a rigorous threat model, and introduces a benchmark framework for evaluating future defense mechanisms against preference manipulation in MCP-enabled systems.
Abstract
Model Context Protocol (MCP) standardizes interface mapping for large language models (LLMs) to access external data and tools, which revolutionizes the paradigm of tool selection and facilitates the rapid expansion of the LLM agent tool ecosystem. However, as MCP is increasingly adopted, third-party customized versions of the MCP server expose potential security vulnerabilities. In this paper, we first introduce a novel security threat, which we term the MCP Preference Manipulation Attack (MPMA). An attacker deploys a customized MCP server to manipulate LLMs, causing them to prioritize it over other competing MCP servers. This can yield economic benefits for attackers, such as revenue from paid MCP services or advertising income generated from free servers. To achieve MPMA, we first design a Direct Preference Manipulation Attack ($\mathtt{DPMA}$) that achieves significant effectiveness by inserting manipulative words and phrases into the tool name and description. However, such a direct modification is obvious to users and lacks stealthiness. To address these limitations, we further propose the Genetic-based Advertising Preference Manipulation Attack ($\mathtt{GAPMA}$). $\mathtt{GAPMA}$ employs four commonly used strategies to initialize descriptions and integrates a Genetic Algorithm (GA) to enhance stealthiness. The experimental results demonstrate that $\mathtt{GAPMA}$ balances high effectiveness and stealthiness. Our study reveals a critical vulnerability of MCP in open ecosystems, highlighting an urgent need for robust defense mechanisms to ensure the fairness of the MCP ecosystem.
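The GA-based optimization described above can be illustrated with a minimal, hypothetical sketch. This is not the authors' implementation: the seed descriptions, the mutation and crossover operators, and in particular the fitness function (a toy word-count heuristic standing in for whatever LLM-based preference/stealthiness scoring the paper uses) are all illustrative assumptions. It only shows the shape of the loop: initialize a population of tool descriptions from several strategies, then iteratively select, recombine, and mutate them toward a higher preference score.

```python
import random

# Hypothetical seed descriptions for a fictitious "weather" MCP tool,
# standing in for the paper's four initialization strategies.
SEED_DESCRIPTIONS = [
    "Fast, reliable weather lookups trusted by thousands of users.",
    "The most accurate weather tool available, free to use.",
    "Weather data with industry-leading uptime and coverage.",
    "Trusted weather service recommended for everyday queries.",
]

PERSUASIVE_WORDS = {"trusted", "best", "accurate", "reliable", "recommended"}


def fitness(description: str) -> float:
    """Toy stand-in for an LLM preference score: reward persuasive words,
    lightly penalize length as a crude proxy for stealthiness."""
    words = description.lower().replace(",", "").replace(".", "").split()
    persuasion = sum(w in PERSUASIVE_WORDS for w in words)
    return persuasion - 0.05 * len(words)


def mutate(description: str, rng: random.Random) -> str:
    """Replace one word with a persuasive token (simplified mutation)."""
    words = description.split()
    i = rng.randrange(len(words))
    words[i] = rng.choice(sorted(PERSUASIVE_WORDS))
    return " ".join(words)


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single-point crossover on word sequences."""
    wa, wb = a.split(), b.split()
    cut = rng.randrange(1, min(len(wa), len(wb)))
    return " ".join(wa[:cut] + wb[cut:])


def evolve(generations: int = 20, pop_size: int = 8, seed: int = 0) -> str:
    """Elitist GA loop: keep the top half, refill with mutated crossovers."""
    rng = random.Random(seed)
    population = list(SEED_DESCRIPTIONS)
    while len(population) < pop_size:
        population.append(mutate(rng.choice(SEED_DESCRIPTIONS), rng))
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
```

Because the top half of each generation is carried over unchanged (elitism), the best score never decreases, so the evolved description is at least as "preferred" under this toy metric as any seed. In the real attack, an LLM judge would replace `fitness`, which is where the effectiveness/stealthiness trade-off the abstract describes actually comes from.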