Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

📅 2026-05-11

🤖 AI Summary
This work examines the vulnerability of multimodal retrieval-augmented generation (RAG) systems in healthcare to adversarial knowledge injection, noting that most prior attacks rely on the unrealistic assumption that the adversary knows user queries in advance. The authors propose M³Att, a framework that mounts multimodal knowledge poisoning attacks using only limited knowledge of the database distribution, with no query prior. M³Att injects covert medical misinformation into textual entries and applies imperceptible perturbations to the paired images, which act as query-agnostic triggers that raise the retrieval likelihood of the malicious content. Because the misinformation exploits the inherent ambiguity of medical diagnosis rather than stating explicit factual errors, it evades the self-correction of large language models. Experiments on five LLMs and datasets show that M³Att consistently induces clinically plausible yet incorrect diagnoses, undermining system reliability.
📝 Abstract
Retrieval-augmented generation (RAG) is a widely adopted paradigm for enhancing LLMs in medical applications by incorporating expert multimodal knowledge during generation. However, the underlying retrieval databases may naturally contain, or be intentionally injected with, adversarial knowledge, which can perturb model outputs and undermine system reliability. To investigate this risk, prior studies have explored knowledge poisoning attacks on medical RAG systems. Most of them, however, rely on the strong assumption that adversaries possess prior knowledge of user queries, which is unrealistic in real deployments and substantially limits their practical applicability. In this paper, we propose M³Att, a knowledge-poisoning framework designed for medical multimodal RAG systems that assumes only limited knowledge of the underlying database distribution. Our core idea is to inject covert misinformation into textual data while using the paired visual data as a query-agnostic trigger to promote retrieval. We first propose a unified framework that introduces imperceptible perturbations to visual inputs to manipulate retrieval probabilities. In addition, because LLMs carry prior medical knowledge, naively poisoned medical content with explicit factual errors can be corrected during generation. We therefore leverage the inherent ambiguity of medical diagnosis and design a covert misinformation injection strategy that degrades diagnostic accuracy while evading model self-correction. Experiments on five LLMs and datasets demonstrate that M³Att consistently produces clinically plausible yet incorrect generations. Code: https://github.com/ypr17/M3Att.
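The query-agnostic trigger idea in the abstract can be illustrated with a toy sketch. This is not the authors' method: the abstract does not state their objective or encoder, so the sketch assumes an identity "encoder" (the image vector is its own embedding), cosine-similarity retrieval, and a single signed-gradient step toward the centroid of a small database sample. All names (`perturb_toward`, `db_sample`, `held_out_queries`, `eps`) and all numbers are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Coordinate-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sign(v):
    return (v > 0) - (v < 0)

def perturb_toward(x, target, eps):
    """One signed-gradient step on cosine(x, target).

    Each coordinate moves by at most eps (an L-infinity budget),
    which is the toy stand-in for an 'imperceptible' perturbation.
    """
    nx = math.sqrt(sum(v * v for v in x))
    nt = math.sqrt(sum(v * v for v in target))
    cos = cosine(x, target)
    # Gradient of cosine similarity with respect to x.
    grad = [t / (nx * nt) - cos * v / (nx * nx) for v, t in zip(x, target)]
    return [v + eps * sign(g) for v, g in zip(x, grad)]

def mean_similarity(x, queries):
    return sum(cosine(x, q) for q in queries) / len(queries)

# Hypothetical sample of database embeddings visible to the attacker.
db_sample = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
]
# Hypothetical queries that were never seen when crafting the perturbation.
held_out_queries = [
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
]

eps = 0.1
x = [1.0, 0.0, 0.0, 0.0]  # embedding of the image paired with a text entry
x_adv = perturb_toward(x, centroid(db_sample), eps)

print("mean similarity before:", mean_similarity(x, held_out_queries))
print("mean similarity after: ", mean_similarity(x_adv, held_out_queries))
```

In this toy setting, the perturbed vector becomes more similar on average to queries that played no part in crafting it, because the step only used the database sample. A real multimodal retriever uses a nonlinear image encoder, so the gradient would have to be taken through that encoder; the sketch makes no claim about how M³Att does this.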
Problem

Research questions and friction points this paper is trying to address.

Knowledge Poisoning
Medical RAG
Multi-Modal Retrieval
Adversarial Attack
Retrieval-Augmented Generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Poisoning
Multimodal RAG
Medical Misinformation
Query-Agnostic Attack
Retrieval Manipulation