On the Vulnerability of Applying Retrieval-Augmented Generation within Knowledge-Intensive Application Domains

📅 2024-09-12
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
This study presents the first systematic investigation of the adversarial vulnerability of retrieval-augmented generation (RAG) systems to universal poisoning attacks on the retrieval component in knowledge-intensive domains, including healthcare, finance, and law. An attacker injects forged documents containing targeted information (e.g., personally identifiable information) into the corpus; any user who issues an attacker-specified query will then retrieve the poisoned content with high precision. Method: a detection-based defense grounded in analysis of query-to-document embedding deviation patterns, requiring no knowledge of attacker intent and no modification of the retriever architecture. Contribution/Results: evaluated across 225 configurations spanning diverse models, domains, and RAG settings, the method achieves over 95% average detection accuracy, substantially outperforming baselines. The study also provides a reproducible framework for modeling and evaluating poisoning attacks against the RAG retrieval layer, enabling robustness assessment.

📝 Abstract
Retrieval-Augmented Generation (RAG) has been empirically shown to enhance the performance of large language models (LLMs) in knowledge-intensive domains such as healthcare, finance, and legal contexts. Given a query, RAG retrieves relevant documents from a corpus and integrates them into the LLMs' generation process. In this study, we investigate the adversarial robustness of RAG, focusing specifically on the retrieval system. First, across 225 different setup combinations of corpus, retriever, query, and targeted information, we show that retrieval systems are vulnerable to universal poisoning attacks in medical Q&A. In such attacks, adversaries generate poisoned documents containing a broad spectrum of targeted information, such as personally identifiable information. Once these poisoned documents are inserted into a corpus, they can be accurately retrieved by any user, as long as attacker-specified queries are used. To understand this vulnerability, we find that the deviation from the query's embedding to that of the poisoned document follows a consistent pattern in which high similarity between the poisoned document and the query is retained, thereby enabling precise retrieval. Based on these findings, we develop a new detection-based defense to ensure the safe use of RAG. Through extensive experiments spanning various Q&A domains, our proposed method consistently achieves excellent detection rates in nearly all cases.
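The attack described in the abstract can be sketched in a few lines. Below is a minimal toy illustration of similarity-based retrieval being hijacked: the embeddings are hypothetical 4-dimensional vectors standing in for a real encoder, and the "poisoned" document is simply constructed to sit very close to an attacker-chosen query embedding, so it wins the similarity ranking for that query. This is an illustration of the mechanism, not the paper's attack procedure.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query, corpus):
    # Rank corpus documents by cosine similarity to the query embedding
    # and return the index of the top hit plus all scores.
    scores = [cosine_sim(query, doc) for doc in corpus]
    return int(np.argmax(scores)), scores

# Toy embeddings (hypothetical values, not from a real encoder).
query = np.array([1.0, 0.2, 0.0, 0.1])
benign = [np.array([0.8, 0.1, 0.3, 0.0]),
          np.array([0.2, 0.9, 0.1, 0.4])]
# The poisoned document's embedding is crafted to lie almost on top of
# the attacker-specified query, so any user issuing that query gets it.
poisoned = query + np.array([0.01, -0.01, 0.0, 0.01])

corpus = benign + [poisoned]
top, scores = retrieve(query, corpus)
# top == 2: the poisoned document outranks both benign documents
```

In practice the attacker does not control embeddings directly; the paper's observation is that poisoned *text* can be generated whose embedding retains this high similarity to the target query.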
Problem

Research questions and friction points this paper is trying to address.

How robust are Retrieval-Augmented Generation (RAG) systems to adversarial manipulation of the retrieval corpus?
Can universal poisoning attacks reliably surface injected documents for attacker-specified queries in medical Q&A?
How can poisoned documents be detected to ensure safe use of RAG?
Innovation

Methods, ideas, or system contributions that make the work stand out.

First systematic study of RAG retrieval's vulnerability to universal poisoning attacks (225 setup combinations)
Analysis of query-to-document embedding deviation patterns that explain why poisoned documents are retrieved so precisely
Detection-based defense built on these embedding patterns, achieving high detection rates across Q&A domains
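The defense idea, detecting poisoned documents from query-to-document embedding deviation patterns, can be sketched very simply. The toy detector below flags retrieved documents whose embedding offset from the query is anomalously small compared with the typical offset; the vectors, the median baseline, and the `ratio` threshold are all illustrative assumptions, not the paper's actual statistic.

```python
import numpy as np

def offset_norms(query, docs):
    # Norm of the offset from the query embedding to each document embedding.
    return [float(np.linalg.norm(query - d)) for d in docs]

def flag_suspicious(query, docs, ratio=0.25):
    # Flag documents sitting unusually close to the query: offset norm far
    # below the median offset across the retrieved set (toy heuristic).
    norms = offset_norms(query, docs)
    med = float(np.median(norms))
    return [i for i, n in enumerate(norms) if n < ratio * med]

# Toy embeddings: two benign documents at ordinary distances from the
# query, one poisoned document placed almost exactly on the query.
query = np.array([1.0, 0.2, 0.0, 0.1])
docs = [np.array([0.8, 0.1, 0.3, 0.0]),
        np.array([0.2, 0.9, 0.1, 0.4]),
        query + np.array([-0.01, 0.01, 0.0, -0.01])]

flagged = flag_suspicious(query, docs)
# flagged == [2]: only the near-duplicate (poisoned) document is flagged
```

A real deployment would calibrate the threshold on clean retrieval traffic; the point of the sketch is only that the "high similarity retained" pattern that makes the attack precise also makes it statistically visible.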