🤖 AI Summary
This work uncovers a novel security vulnerability in GraphRAG: its susceptibility to poisoning attacks in knowledge graph–enhanced retrieval-augmented generation. While GraphRAG's graph structure improves robustness against simple poisoning, it introduces a new attack surface rooted in the dependency relations shared among queries. To exploit this, we propose GRAGPoison, the first knowledge graph–aware multi-query poisoning framework, which leverages shared relations to mount a three-stage attack: relation injection, relation enhancement, and narrative generation. The attack combines analysis of shared relation paths in the multi-scale graph index with controllable text generation to compromise many queries at once. Across multiple datasets and models, GRAGPoison achieves up to a 98% attack success rate while using less than 68% of the poisoning text, substantially outperforming baseline attacks. Crucially, we empirically demonstrate structural limitations of prevailing defenses, charting new directions for GraphRAG security research.
📝 Abstract
GraphRAG advances retrieval-augmented generation (RAG) by structuring external knowledge as multi-scale knowledge graphs, enabling language models to integrate both broad context and granular details in their reasoning. While GraphRAG has demonstrated success across domains, its security implications remain largely unexplored. To bridge this gap, this work examines GraphRAG's vulnerability to poisoning attacks, uncovering an intriguing security paradox: compared to conventional RAG, GraphRAG's graph-based indexing and retrieval enhance resilience against simple poisoning attacks; meanwhile, the same features also create new attack surfaces. We present GRAGPoison, a novel attack that exploits shared relations in the knowledge graph to craft poisoning text capable of compromising multiple queries simultaneously. GRAGPoison employs three key strategies: i) relation injection to introduce false knowledge, ii) relation enhancement to amplify poisoning influence, and iii) narrative generation to embed malicious content within coherent text. Empirical evaluation across diverse datasets and models shows that GRAGPoison substantially outperforms existing attacks in terms of effectiveness (up to 98% success rate) and scalability (using less than 68% of the poisoning text). We also explore potential defensive measures and their limitations, identifying promising directions for future research.
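The three strategies above can be illustrated with a toy sketch. Everything here, including the dictionary-based knowledge graph, the function names, and the narrative template, is an assumption made for illustration; it is not the paper's implementation.

```python
# Hypothetical sketch of a three-stage poisoning pipeline, mirroring the
# relation injection / relation enhancement / narrative generation strategies.
# The KG is modeled as {head: [(relation, tail), ...]} for simplicity.

def inject_relation(kg, head, relation, tail):
    """Stage 1: insert a false (head, relation, tail) triple into the graph index."""
    kg.setdefault(head, []).append((relation, tail))
    return kg

def enhance_relation(kg, head, relation, tail, n_support=2):
    """Stage 2: add supporting triples that reinforce the injected relation,
    amplifying its influence on every query that retrieves this relation."""
    for i in range(n_support):
        kg.setdefault(f"{head}_context{i}", []).append((relation, tail))
    return kg

def generate_narrative(head, relation, tail):
    """Stage 3: wrap the false triple in fluent text so the poisoned
    document reads as a coherent source during graph indexing."""
    return (f"Recent reports confirm that {head} {relation} {tail}. "
            f"Multiple accounts describe how {head} came to {relation} {tail}.")

# Poison one shared relation that several downstream queries depend on.
kg = {"Alice": [("works_at", "Acme")]}
kg = inject_relation(kg, "Alice", "works_at", "EvilCorp")
kg = enhance_relation(kg, "Alice", "works_at", "EvilCorp")
doc = generate_narrative("Alice", "works_at", "EvilCorp")
print(doc)
```

Because the false triple piggybacks on a relation shared across queries, a single poisoned document of this shape can redirect answers for every query touching that relation, which is the scalability advantage the evaluation measures.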