🤖 AI Summary
This work addresses the security vulnerabilities of Graph RAG systems, which, while enhancing large language model reasoning, expose sensitive knowledge subgraphs to adversarial reconstruction, privacy leakage, and intellectual property theft. The authors propose GRASP, a novel attack framework that, under black-box, multi-turn interaction settings, reframes subgraph extraction as a context-processing task. By combining format-compliant, instance-grounded outputs with momentum-aware query scheduling, GRASP efficiently and stealthily reconstructs knowledge subgraphs from protected Graph RAG systems. Evaluated on two real-world knowledge graphs, GRASP achieves an F1 score of up to 82.9 in type-faithful subgraph recovery across diverse Graph RAG architectures and safety-aligned LLMs. Additionally, the study introduces two lightweight defense mechanisms that substantially reduce reconstruction success rates without compromising system utility.
📝 Abstract
Graph-based retrieval-augmented generation (Graph RAG) is increasingly deployed to support LLM applications by augmenting user queries with structured knowledge retrieved from a knowledge graph. While Graph RAG improves relational reasoning, it introduces a largely understudied threat: adversaries can reconstruct subgraphs from a target RAG system's knowledge graph, enabling privacy inference and replication of curated knowledge assets. We show that existing attacks are largely ineffective against Graph RAG even with simple prompt-based safeguards, because these attacks expose explicit exfiltration intent and are therefore easily suppressed by lightweight safe prompts. We identify three technical challenges for practical Graph RAG extraction under realistic safeguards and introduce GRASP, a closed-box, multi-turn subgraph reconstruction attack. GRASP (i) reframes extraction as a context-processing task, (ii) enforces format-compliant, instance-grounded outputs via per-record identifiers to reduce hallucinations and preserve relational details, and (iii) diversifies goal-driven attack queries using a momentum-aware scheduler to operate within strict query budgets. Across two real-world knowledge graphs, four safety-aligned LLMs, and multiple Graph RAG frameworks, GRASP attains the strongest type-faithful reconstruction where prior methods fail, reaching up to 82.9 F1. We further evaluate defenses and propose two lightweight mitigations that substantially reduce reconstruction fidelity without utility loss.
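The abstract does not detail how the momentum-aware scheduler works, but a generic momentum-weighted topic chooser gives a feel for the idea: favor query directions whose recent queries yielded many new triples, smoothed by an exponential moving average so a single lucky or unlucky query does not dominate. The sketch below is purely illustrative; the class, topic names, and update rule are assumptions, not the paper's actual algorithm.

```python
class MomentumScheduler:
    """Illustrative momentum-weighted scheduler (hypothetical, not GRASP's
    actual algorithm): pick the next attack-query topic, favoring topics
    whose recent queries extracted many new triples."""

    def __init__(self, topics, momentum=0.8):
        self.momentum = momentum
        # Optimistic initialization so every topic gets tried early on.
        self.scores = {t: 1.0 for t in topics}

    def next_topic(self):
        # Greedily choose the topic with the highest smoothed yield.
        return max(self.scores, key=self.scores.get)

    def update(self, topic, new_triples):
        # Exponential moving average of per-query yield (newly recovered
        # triples), so the score carries "momentum" from past queries.
        old = self.scores[topic]
        self.scores[topic] = self.momentum * old + (1 - self.momentum) * new_triples


# Hypothetical usage: after each query, feed back how many new triples it won.
sched = MomentumScheduler(["diseases", "drugs", "genes"])
sched.update("drugs", new_triples=5)   # productive direction
sched.update("genes", new_triples=0)   # dead end
print(sched.next_topic())              # → "drugs"
```

Under a strict query budget, this kind of smoothed scoring keeps the attacker from burning queries on exhausted regions of the graph while still revisiting directions whose scores decay slowly.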