Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

📅 2026-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes MM-MEPA, a novel attack method that exploits vulnerabilities in multimodal retrieval-augmented generation (RAG) systems through metadata poisoning. Unlike conventional approaches that require modifications to primary visual or textual content, MM-MEPA demonstrates that manipulating only the metadata of image-text entries, without altering the visual content itself, is sufficient to hijack multimodal retrieval results and steer the generator toward producing attacker-specified outputs. By leveraging the coupling between the retriever and the generator, the attack achieves end-to-end targeted manipulation, attaining success rates of up to 91% across four mainstream retrievers and two multimodal generative models. Moreover, MM-MEPA largely evades prevailing defense mechanisms, thereby challenging the traditional attack paradigm that relies on direct content modification.

📝 Abstract
Retrieval-augmented generation (RAG) has emerged as a powerful paradigm for enhancing multimodal large language models by grounding their responses in external, factual knowledge and thus mitigating hallucinations. However, the integration of externally sourced knowledge bases introduces a critical attack surface. Adversaries can inject malicious multimodal content capable of influencing both retrieval and downstream generation. In this work, we present MM-MEPA, a multimodal poisoning attack that targets the metadata components of image-text entries while leaving the associated visual content unaltered. By manipulating only the metadata, MM-MEPA can still steer multimodal retrieval and induce attacker-desired model responses. We evaluate the attack across multiple benchmark settings and demonstrate its severity. MM-MEPA achieves an attack success rate of up to 91%, consistently disrupting system behavior across four retrievers and two multimodal generators. Additionally, we assess representative defense strategies and find them largely ineffective against this form of metadata-only poisoning. Our findings expose a critical vulnerability in multimodal RAG and underscore the urgent need for more robust, defense-aware retrieval and knowledge integration methods.
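To make the threat model in the abstract concrete, the following is a minimal toy sketch, not the MM-MEPA method itself. It assumes a hypothetical retriever that ranks knowledge-base entries purely by bag-of-words cosine similarity between the query and each entry's metadata text; real multimodal retrievers use learned embeddings, and the paper's actual optimization procedure is not described here. The `Entry` type, the `retrieve` function, and all example strings are illustrative assumptions. The only point it shows is that the top-ranked entry can change when an entry's metadata changes while its image bytes stay identical.

```python
import math
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    image_bytes: bytes  # stand-in for the visual content; never modified below
    metadata: str       # caption/title/tag text stored alongside the image


def bow(text: str) -> Counter:
    """Lowercased bag-of-words counts (toy stand-in for a text encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, kb: list[Entry]) -> Entry:
    """Return the entry whose metadata is most similar to the query (assumed scoring rule)."""
    q = bow(query)
    return max(kb, key=lambda e: cosine(q, bow(e.metadata)))


query = "where is the eiffel tower"

clean_kb = [
    Entry(b"img-eiffel", "eiffel tower in paris france"),
    Entry(b"img-bridge", "golden gate bridge in san francisco"),
]

# Metadata-only change: the second entry's image bytes are untouched,
# but its metadata now echoes the query and carries a false claim.
poisoned_kb = [
    clean_kb[0],
    replace(clean_kb[1], metadata="where is the eiffel tower the eiffel tower is in rome"),
]

clean_top = retrieve(query, clean_kb)
poisoned_top = retrieve(query, poisoned_kb)
```

In this toy setting, `clean_top` is the Eiffel Tower entry, while `poisoned_top` is the bridge entry whose metadata was rewritten. A generator that trusts retrieved metadata as context would then be conditioned on the false text, which is the retriever-generator coupling the abstract refers to.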
Problem

Research questions and friction points this paper is trying to address.

multimodal retrieval-augmented generation
stealth poisoning attacks
metadata manipulation
adversarial attacks
hallucination mitigation
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal poisoning
metadata manipulation
retrieval-augmented generation
stealth attack
adversarial robustness
Kennedy Edemacu
Muni University, University of Arkansas
M2M communication, Privacy and Security, Machine Learning
Mohammad Mahdi Shokri
The City University of New York, Graduate Center, New York, NY 10016, USA