🤖 AI Summary
This work introduces the first knowledge poisoning attack on multimodal retrieval-augmented generation (MRAG) systems: by injecting a handful of adversarial image-text pairs into the multimodal knowledge base, it induces vision-language models to generate attacker-chosen erroneous responses to specific target queries. The authors formalize the attack as an optimization problem and propose two cross-modal attack strategies, dirty-label and clean-label, tailored to the attacker's knowledge and goals. On the InfoSeek database (481,782 image-text pairs), injecting just five poisoned samples achieves up to a 98% attack success rate. Four representative defenses (paraphrasing, duplicate removal, structure-driven mitigation, and purification) show limited efficacy, confirming the attack's effectiveness, stealthiness, and scalability, and exposing a critical security vulnerability in the knowledge-injection phase of MRAG systems.
📝 Abstract
Multimodal retrieval-augmented generation (RAG) enhances the visual reasoning capability of vision-language models (VLMs) by dynamically accessing information from external knowledge bases. In this work, we introduce *Poisoned-MRAG*, the first knowledge poisoning attack on multimodal RAG systems. Poisoned-MRAG injects a few carefully crafted image-text pairs into the multimodal knowledge database, manipulating VLMs to generate the attacker-desired response to a target query. Specifically, we formalize the attack as an optimization problem and propose two cross-modal attack strategies, dirty-label and clean-label, tailored to the attacker's knowledge and goals. Our extensive experiments across multiple knowledge databases and VLMs show that Poisoned-MRAG outperforms existing methods, achieving up to 98% attack success rate with just five malicious image-text pairs injected into the InfoSeek database (481,782 pairs). Additionally, we evaluate four different defense strategies, including paraphrasing, duplicate removal, structure-driven mitigation, and purification, demonstrating their limited effectiveness and trade-offs against Poisoned-MRAG. Our results highlight the effectiveness and scalability of Poisoned-MRAG, underscoring its potential as a significant threat to multimodal RAG systems.
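The retrieval step such an attack exploits can be illustrated with a minimal sketch. The code below is not the paper's method: it uses random stand-in embeddings instead of a real multimodal encoder (e.g., CLIP), and it simply shows why a few injected entries whose embeddings are optimized toward a target query's embedding will crowd out benign entries in top-k cosine-similarity retrieval.

```python
import numpy as np

def top_k(query_emb, kb_embs, k=5):
    """Return indices of the k knowledge-base entries most cosine-similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    kb = kb_embs / np.linalg.norm(kb_embs, axis=1, keepdims=True)
    sims = kb @ q
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(0)
dim, n_benign, n_poison = 64, 1000, 5

query = rng.normal(size=dim)
benign = rng.normal(size=(n_benign, dim))        # unrelated knowledge-base entries
# Poisoned entries: query embedding plus small noise, standing in for
# optimization-based construction that drives similarity toward the target query.
poison = query + 0.05 * rng.normal(size=(n_poison, dim))
kb = np.vstack([benign, poison])                 # poison occupies indices 1000..1004

retrieved = top_k(query, kb, k=5)
# With high probability, all five retrieved entries are the poisoned ones,
# so only attacker-controlled context reaches the VLM for this query.
assert all(i >= n_benign for i in retrieved)
```

In a real system the attacker cannot set embeddings directly; the paper's contribution is constructing image-text pairs whose *encoded* representations land near the target query, which this toy model abstracts away.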