🤖 AI Summary
This work addresses information leakage in retrieval-augmented generation (RAG) systems, a risk that has remained underexplored because no systematic evaluation framework existed. The authors propose LeakDojo, a configurable benchmarking framework for controlled assessment of RAG leakage threats. Using it, they benchmark six existing attack methods across fourteen large language models, four datasets, and multiple RAG architectures. The study finds that query generation and adversarial instructions contribute independently to leakage, and that overall leakage is well approximated by the product of these two contributions. It also reveals a positive correlation between a model’s instruction-following capability and its susceptibility to leakage, and shows that improving RAG faithfulness can inadvertently increase leakage risk. These findings offer actionable insights for secure RAG system design.
📝 Abstract
Retrieval-Augmented Generation (RAG) enables large language models (LLMs) to leverage external knowledge, but also exposes valuable RAG databases to leakage attacks. As RAG systems grow more complex and LLMs exhibit stronger instruction-following capabilities, existing studies fall short of systematically assessing RAG leakage risks. We present LeakDojo, a configurable framework for controlled evaluation of RAG leakage. Using LeakDojo, we benchmark six existing attacks across fourteen LLMs, four datasets, and diverse RAG systems. Our study reveals that (1) query generation and adversarial instructions contribute independently to leakage, with overall leakage well approximated by their product; (2) stronger instruction-following capability correlates with higher leakage risk; and (3) improvements in RAG faithfulness can increase leakage risk. These findings provide actionable insights for understanding and mitigating RAG leakage in practice. Our codebase is available at https://github.com/yeasen-z/LeakDojo.
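
The product approximation in finding (1) can be illustrated with a minimal sketch. It is not taken from the LeakDojo codebase: the function name `approx_leakage`, the two parameter names, the assumption that each contribution is a rate in [0, 1], and the example numbers are all hypothetical and chosen only for illustration.

```python
def approx_leakage(query_factor: float, instruction_factor: float) -> float:
    """Approximate overall leakage as the product of two independent factors.

    Assumption (not from the paper): each factor is a rate in [0, 1], where
    `query_factor` stands for the contribution of query generation and
    `instruction_factor` for the contribution of the adversarial instruction.
    """
    for name, value in (
        ("query_factor", query_factor),
        ("instruction_factor", instruction_factor),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return query_factor * instruction_factor


# Hypothetical values, not results from the paper.
estimate = approx_leakage(query_factor=0.5, instruction_factor=0.5)
print(estimate)
```

Under this reading, a weakness in either factor caps the overall leakage: if either value is zero, the estimate is zero regardless of the other.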