🤖 AI Summary
This work presents the first systematic study of membership inference attacks (MIAs) against retrieval-augmented generation (RAG) systems: a novel privacy threat in which an adversary infers whether a given text resides in a private retrieval database solely by observing model outputs. We propose the first efficient MIA framework tailored to RAG that operates in both black-box and gray-box settings, leveraging prompt engineering to enable gradient-free attacks. We further design a lightweight defense that mitigates information leakage through controlled perturbation of the retrieval template instructions. Extensive evaluation across multiple benchmarks, including Natural Questions and HotpotQA, and diverse open- and closed-source generative models demonstrates strong attack performance (AUC: 0.85–0.92) and an effective defense: our mitigation reduces attack success rates by up to 76%, substantially improving the privacy robustness of RAG systems.
📝 Abstract
Retrieval-Augmented Generation (RAG) systems have shown great promise in natural language processing. However, their reliance on data stored in a retrieval database, which may contain proprietary or sensitive information, introduces new privacy concerns. Specifically, an attacker may be able to infer whether a certain text passage appears in the retrieval database by observing the outputs of the RAG system, an attack known as a Membership Inference Attack (MIA). Despite the significance of this threat, MIAs against RAG systems have so far remained under-explored. This study addresses this gap by introducing an efficient and easy-to-use method for conducting MIAs against RAG systems. We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models, showing that the membership of a document in the retrieval database can be efficiently determined through the creation of an appropriate prompt in both black-box and gray-box settings. Moreover, we introduce an initial defense strategy based on adding instructions to the RAG template, which shows high effectiveness for some datasets and models. Our findings highlight the importance of implementing security countermeasures in deployed RAG systems and developing more advanced defenses to protect the privacy and security of retrieval databases.
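To make the black-box threat model concrete, the sketch below illustrates the general idea of a prompt-based membership probe: the attacker embeds the candidate passage in a query and scores how much of it the system's answer echoes back, on the intuition that a passage present in the retrieval database will be retrieved and grounded in the response. This is a minimal toy illustration, not the paper's actual attack; `rag_answer`, `RETRIEVAL_DB`, the prompt wording, and the 0.5 threshold are all hypothetical stand-ins.

```python
# Hypothetical black-box membership inference probe against a RAG system.
# RETRIEVAL_DB and rag_answer are toy stand-ins for the deployed system's
# private database and query endpoint; they are NOT from the paper.

RETRIEVAL_DB = [
    "The Eiffel Tower was completed in 1889 for the World's Fair.",
    "Photosynthesis converts light energy into chemical energy.",
]

def rag_answer(prompt: str) -> str:
    """Toy RAG endpoint: returns any stored passage that strongly overlaps
    the prompt, mimicking retrieval followed by grounded generation."""
    for passage in RETRIEVAL_DB:
        overlap = set(prompt.lower().split()) & set(passage.lower().split())
        if len(overlap) >= 5:
            return passage  # a real system would generate text grounded in this passage
    return "I don't know."

def membership_score(candidate: str) -> float:
    """Crude MIA signal: fraction of the candidate's words echoed in the answer."""
    prompt = f"Does this statement appear in your documents? Answer using them: {candidate}"
    answer = rag_answer(prompt)
    cand_words = set(candidate.lower().split())
    return len(cand_words & set(answer.lower().split())) / max(len(cand_words), 1)

def is_member(candidate: str, threshold: float = 0.5) -> bool:
    """Declare membership when enough of the candidate leaks into the output."""
    return membership_score(candidate) >= threshold

member = "The Eiffel Tower was completed in 1889 for the World's Fair."
non_member = "The Great Wall of China stretches over 13,000 miles."
print(is_member(member), is_member(non_member))  # → True False
```

The defense described above fits the same picture: prepending instructions to the RAG template (e.g. telling the model not to reveal or confirm retrieved passages) lowers the echo signal this probe relies on.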