One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems

📅 2025-05-15

📈 Citations: 0

✨ Influential: 0

career value

195K/year

🤖 AI Summary

This work addresses the critical vulnerability in Retrieval-Augmented Generation (RAG) systems wherein knowledge bases are susceptible to undetectable single-document knowledge poisoning attacks. We propose the first targeted single-document knowledge poisoning attack. To overcome interference from both authentic documents and large language models’ (LLMs’) intrinsic knowledge, we design AuthChain—a novel framework integrating evidence-chain reasoning, authority-aware semantic modeling, and retrieval-adversarial generation—enabling reliable suppression of target answers without requiring multi-document collaborative injection. AuthChain is compatible with mainstream RAG architectures and diverse LLM backbones. Evaluated on six state-of-the-art LLMs, it achieves significantly higher attack success rates than prior art. Moreover, it remains highly stealthy against existing RAG defenses—including retrieval-result filtering and confidence-based verification—exposing severe security risks arising from isolated knowledge contamination.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have shown improved performance in generating accurate responses. However, the dependence on external knowledge bases introduces potential security vulnerabilities, particularly when these knowledge bases are publicly accessible and modifiable. Poisoning attacks on knowledge bases for RAG systems face two fundamental challenges: the injected malicious content must compete with multiple authentic documents retrieved by the retriever, and LLMs tend to trust retrieved information that aligns with their internal memorized knowledge. Previous works attempt to address these challenges by injecting multiple malicious documents, but such saturation attacks are easily detectable and impractical in real-world scenarios. To enable the effective single document poisoning attack, we propose AuthChain, a novel knowledge poisoning attack method that leverages Chain-of-Evidence theory and authority effect to craft more convincing poisoned documents. AuthChain generates poisoned content that establishes strong evidence chains and incorporates authoritative statements, effectively overcoming the interference from both authentic documents and LLMs' internal knowledge. Extensive experiments across six popular LLMs demonstrate that AuthChain achieves significantly higher attack success rates while maintaining superior stealthiness against RAG defense mechanisms compared to state-of-the-art baselines.

Problem

Research questions and friction points this paper is trying to address.

Security vulnerabilities in RAG-enhanced LLMs due to external knowledge bases

Challenges in poisoning attacks: competition with authentic documents and LLM trust

Proposing AuthChain for effective single-document poisoning with high stealthiness

Innovation

Methods, ideas, or system contributions that make the work stand out.

AuthChain leverages Chain-of-Evidence theory

Incorporates authoritative statements effectively

Achieves high attack success stealthily

🔎 Similar Papers

On the Vulnerability of Applying Retrieval-Augmented Generation within Knowledge-Intensive Application Domains