E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes E-MIA, a membership inference attack against black-box Retrieval-Augmented Generation (RAG) systems that determines whether a target document has been incorporated into the system's retrieval corpus. The method converts verifiable hard evidence from the document into structured exam-style queries (fill-in-the-blank, single-choice, multiple-choice, and true/false questions) and aggregates the scores across multiple such queries into a membership signal, without relying on semantic similarity or explicit confirmation probes. Experiments across multiple datasets and RAG configurations show that E-MIA separates members from non-members better than existing approaches while keeping its queries natural. The study offers empirical evidence that exam-style questioning is an effective and practical basis for membership inference in RAG systems.

📝 Abstract

Retrieval-Augmented Generation (RAG) equips large language models (LLMs) with external evidence by retrieving documents at inference time, but it also turns the retrieval corpus into a sensitive asset. Under a black-box setting, an adversary given a candidate document can infer whether it has been ingested into the RAG knowledge base (i.e., document-level membership inference) solely from query-response interactions, thereby leaking corpus coverage and the existence of sensitive topics. Existing RAG MIA methods either rely on soft signals such as semantic similarity, which often yield overlapping member/non-member score distributions and unstable thresholds, or employ explicit confirmation probes whose intent is conspicuous and thus prone to refusal and detection. We propose E-MIA, which converts verifiable hard evidence in the target document (e.g., fine-grained details, proper nouns/technical terms, definitional statements, metadata cues, and causal/constraint relations) into an exam with four objectively gradable question types: fill-in-the-blank (FB), single-choice (SC), multiple-choice (MC), and true/false (T/F). It uses the aggregated exam score across multiple evidence-targeted questions as the membership signal. Experiments across multiple datasets and diverse RAG configurations demonstrate that E-MIA improves member/non-member separability in stringent settings while preserving natural, stealthy queries, and we further analyze the impact of question composition and exam length on attack effectiveness.
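
The score-aggregation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Question` class, the exact-match grader, the mean-score aggregation, the 0.5 threshold, and the `ask` callable (a stand-in for one black-box query to the target RAG system) are all assumptions made for illustration, since the abstract does not specify how questions are graded or how the threshold is chosen.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Question:
    """One exam item built from evidence in the candidate document (hypothetical type)."""
    kind: str         # "FB", "SC", "MC", or "TF"
    prompt: str       # text sent to the RAG system
    answer: Set[str]  # gold answer(s); a single element except for MC


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so grading ignores trivial formatting."""
    return " ".join(text.strip().lower().split())


def grade(question: Question, response: str) -> float:
    """Assumed exact-match grading: 1.0 for a fully correct answer, else 0.0."""
    gold = {normalize(a) for a in question.answer}
    if question.kind == "MC":
        # Multiple-choice: the comma-separated selection must equal the gold set.
        chosen = {normalize(part) for part in response.split(",") if part.strip()}
        return 1.0 if chosen == gold else 0.0
    return 1.0 if normalize(response) in gold else 0.0


def exam_score(questions: List[Question], ask: Callable[[str], str]) -> float:
    """Mean grade over the exam; `ask` issues one query and returns the reply text."""
    if not questions:
        raise ValueError("exam must contain at least one question")
    return sum(grade(q, ask(q.prompt)) for q in questions) / len(questions)


def is_member(score: float, threshold: float = 0.5) -> bool:
    """Flag the document as a member when the exam score reaches the (assumed) threshold."""
    return score >= threshold
```

In this sketch a higher exam score means the system answered more evidence-specific questions correctly, which is treated as a sign that the document is in the retrieval corpus. In practice the threshold would need to be calibrated on documents with known membership, and free-text replies would need answer extraction before grading.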

Problem

Research questions and friction points this paper is trying to address.

Membership Inference Attack

Retrieval-Augmented Generation

Black-Box Attack

Document-Level Privacy

RAG Security

Innovation

Methods, ideas, or system contributions that make the work stand out.

Membership Inference Attack

Retrieval-Augmented Generation

Black-Box Attack