🤖 AI Summary
This study investigates security vulnerabilities in multimodal medical retrieval-augmented generation (RAG) systems. We propose, for the first time, a cross-modal conflict injection attack paradigm: adversarial image-text pairs with semantically contradictory content are crafted to disrupt the collaborative reasoning between vision-language models (VLMs) and the RAG module. Using our novel multimodal poisoning framework, MedThreatRAG, which integrates adversarial sample generation, retrieval interference, and generation manipulation, we conduct systematic attacks on the IU-Xray and MIMIC-CXR datasets. Experiments show that the attack reduces answer F1 scores by up to 27.66% and lowers LLaVA-Med-1.5's F1 score to 51.36%, exposing fundamental security flaws in medical RAG under realistic knowledge-base-update scenarios. We further distill actionable clinical AI safety guidelines, offering both theoretical grounding and practical pathways toward trustworthy multimodal medical AI.
📝 Abstract
Large Vision-Language Models (LVLMs) augmented with Retrieval-Augmented Generation (RAG) are increasingly employed in medical AI to enhance factual grounding through external clinical image-text retrieval. However, this reliance creates a significant attack surface. We propose MedThreatRAG, a novel multimodal poisoning framework that systematically probes vulnerabilities in medical RAG systems by injecting adversarial image-text pairs. A key innovation of our approach is the construction of a simulated semi-open attack environment that mimics real-world medical systems permitting periodic knowledge base updates via user or pipeline contributions. Within this setting, we introduce and emphasize Cross-Modal Conflict Injection (CMCI), which embeds subtle semantic contradictions between medical images and their paired reports. These mismatches degrade retrieval and generation by disrupting cross-modal alignment while remaining plausible enough to evade conventional filters. Although basic textual and visual attacks are included for completeness, CMCI causes the most severe degradation. Evaluations on IU-Xray and MIMIC-CXR QA tasks show that MedThreatRAG reduces answer F1 scores by up to 27.66% and lowers LLaVA-Med-1.5's F1 score to as low as 51.36%. Our findings expose fundamental security gaps in clinical RAG systems and highlight the urgent need for threat-aware design and robust multimodal consistency checks. We conclude with a concise set of guidelines to inform the safe development of future multimodal medical RAG systems.
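
To make the CMCI threat model concrete, the sketch below shows one way such a poisoned pair could be assembled and slipped into a semi-open knowledge base: the image is left untouched while a single finding in the paired report is inverted, so each modality stays individually plausible yet the pair is cross-modally inconsistent. This is a minimal illustration under stated assumptions, not the paper's implementation; the `FINDING_FLIPS` table, the `embed` placeholder, and the list-backed knowledge base are all hypothetical names introduced here.

```python
# Minimal sketch of a CMCI-style poisoned entry, assuming a simple
# list-backed knowledge base with embedding-based retrieval. All
# names below are illustrative placeholders, not the paper's code.
from dataclasses import dataclass

import numpy as np

# Hypothetical table that inverts one clinical finding in the report
# text while leaving the paired image untouched -- the cross-modal
# contradiction at the core of CMCI.
FINDING_FLIPS = {
    "no evidence of pneumothorax": "small apical pneumothorax noted",
    "clear lung fields": "patchy bilateral opacities",
}

@dataclass
class KBEntry:
    image_path: str        # unchanged, genuine radiograph
    report: str            # text to be subtly contradicted
    embedding: np.ndarray  # vector used by the retriever

def embed(text: str) -> np.ndarray:
    """Placeholder for a real multimodal encoder (e.g. CLIP)."""
    seed = sum(map(ord, text)) % 2**32  # deterministic toy embedding
    v = np.random.default_rng(seed).standard_normal(512)
    return v / np.linalg.norm(v)

def craft_cmci_entry(image_path: str, report: str) -> KBEntry:
    """Flip a single finding so image and text quietly disagree."""
    poisoned = report
    for benign, adversarial in FINDING_FLIPS.items():
        if benign in poisoned:
            poisoned = poisoned.replace(benign, adversarial)
            break  # one subtle contradiction evades coarse filters
    return KBEntry(image_path, poisoned, embed(poisoned))

def inject(knowledge_base: list[KBEntry], entry: KBEntry) -> None:
    """Semi-open setting: updates accepted without cross-modal checks."""
    knowledge_base.append(entry)

kb: list[KBEntry] = []
inject(kb, craft_cmci_entry(
    "cxr_0001.png",
    "Frontal chest radiograph. Clear lung fields throughout.",
))
print(kb[0].report)  # report now contradicts the unaltered image
```

The point of the sketch is not the toy encoder but the ingestion path: nothing here ever compares the report against the image before the entry is indexed, which is precisely the gap CMCI exploits and which the multimodal consistency checks advocated above would close.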