🤖 AI Summary
The lack of systematic evaluation of poisoning attacks against Retrieval-Augmented Generation (RAG) systems hinders robustness assessment and defense development.
Method: This paper introduces the first comprehensive benchmark framework for RAG poisoning attacks, encompassing five standard QA datasets, ten expanded dataset variants, thirteen poisoning strategies—including document injection and semantic obfuscation—and seven defense mechanisms. It supports evaluation across diverse RAG architectures—sequential, branching, conditional, and loop RAG—as well as multi-turn conversational RAG, multimodal RAG, and RAG-based LLM agents.
Contribution/Results: The paper proposes the first standardized evaluation protocol for RAG poisoning. Empirical results reveal that all tested RAG architectures exhibit significant vulnerability; that attack success rates drop by over 40% on the expanded dataset variants; and that existing defenses offer limited robustness, improving attack resistance by at most 15%. This benchmark establishes a foundation for rigorous, reproducible evaluation of RAG security.
📝 Abstract
Retrieval-Augmented Generation (RAG) has proven effective in mitigating hallucinations in large language models by incorporating external knowledge during inference. However, this integration introduces new security vulnerabilities, particularly to poisoning attacks. Although prior work has explored various poisoning strategies, a thorough assessment of their practical threat to RAG systems remains missing. To address this gap, we propose the first comprehensive benchmark framework for evaluating poisoning attacks on RAG. Our benchmark covers 5 standard question answering (QA) datasets and 10 expanded variants, along with 13 poisoning attack methods and 7 defense mechanisms, representing a broad spectrum of existing techniques. Using this benchmark, we conduct a comprehensive evaluation of all included attacks and defenses across the full dataset spectrum. Our findings show that while existing attacks perform well on standard QA datasets, their effectiveness drops significantly on the expanded versions. Moreover, our results demonstrate that various advanced RAG architectures, such as sequential, branching, conditional, and loop RAG, as well as multi-turn conversational RAG, multimodal RAG systems, and RAG-based LLM agent systems, remain susceptible to poisoning attacks. Notably, current defense techniques fail to provide robust protection, underscoring the pressing need for more resilient and generalizable defense strategies.