E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems

📅 2026-05-01
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work proposes E-MIA, a membership inference attack against black-box Retrieval-Augmented Generation (RAG) systems that determines whether a target document has been incorporated into the system’s retrieval corpus. The method converts verifiable hard evidence from the document into exam-style queries (fill-in-the-blank, single-choice, multiple-choice, and true/false questions) and aggregates the scores of the responses across multiple such queries into a membership signal, without relying on semantic similarity or explicit confirmation probes. Experiments across multiple datasets and diverse RAG configurations show that E-MIA separates members from non-members better than existing approaches while keeping the queries natural. The study presents empirical evidence that exam-style questioning is an effective and practical basis for membership inference in RAG systems.
📝 Abstract
Retrieval-Augmented Generation (RAG) equips large language models (LLMs) with external evidence by retrieving documents at inference time, but it also turns the retrieval corpus into a sensitive asset. Under a black-box setting, an adversary given a candidate document can infer whether it has been ingested into the RAG knowledge base (i.e., document-level membership inference) solely from query-response interactions, thereby leaking corpus coverage and the existence of sensitive topics. Existing RAG MIA methods either rely on soft signals such as semantic similarity, which often yield overlapping member/non-member score distributions and unstable thresholds, or employ explicit confirmation probes whose intent is conspicuous and thus prone to refusal and detection. We propose E-MIA, which converts verifiable hard evidence in the target document (e.g., fine-grained details, proper nouns/technical terms, definitional statements, metadata cues, and causal/constraint relations) into an exam with four objectively gradable question types (fill-in-the-blank, FB; single-choice, SC; multiple-choice, MC; and true/false, T/F), and uses the aggregated exam score across multiple evidence-targeted questions as the membership signal. Experiments across multiple datasets and diverse RAG configurations demonstrate that E-MIA improves member/non-member separability in stringent settings while preserving natural, stealthy queries, and we further analyze the impact of question composition and exam length on attack effectiveness.
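The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Question` type, the all-or-nothing grading rule, the mean-score aggregation, the 0.5 threshold, the sample exam, and the two stub RAG functions are all assumptions made for illustration, since the abstract does not specify how questions are generated, graded, or thresholded.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Question:
    """One exam item (hypothetical structure, not taken from the paper)."""
    kind: str    # "FB", "SC", "MC", or "TF"
    prompt: str  # natural-language question sent to the RAG system
    answer: str  # answer key; for MC, the correct option letters, e.g. "AC"


def grade(question: Question, response: str) -> float:
    """Return 1.0 for a correct response, else 0.0 (assumed all-or-nothing grading)."""
    got = response.strip().lower()
    key = question.answer.strip().lower()
    if question.kind == "MC":
        # Multiple choice: compare the selected option letters as a set,
        # ignoring order, commas, and spaces.
        return float(set(got.replace(",", "").replace(" ", "")) == set(key))
    return float(got == key)


def exam_score(questions: Sequence[Question], ask: Callable[[str], str]) -> float:
    """Mean grade over the exam; `ask` is the black-box query interface."""
    if not questions:
        raise ValueError("exam must contain at least one question")
    return sum(grade(q, ask(q.prompt)) for q in questions) / len(questions)


def infer_membership(questions: Sequence[Question], ask: Callable[[str], str],
                     threshold: float = 0.5) -> bool:
    """Predict 'member' when the exam score reaches an assumed threshold."""
    return exam_score(questions, ask) >= threshold


# Invented exam built from a fictional target document.
EXAM = [
    Question("FB", "The protocol was ratified in ____.", "1997"),
    Question("SC", "Which unit led the audit? A) Finance B) Legal C) IT", "B"),
    Question("MC", "Which sites were covered? A) Oslo B) Lima C) Pune", "AC"),
    Question("TF", "The audit found no violations. True or False?", "False"),
]

# Stub systems standing in for a real RAG endpoint.
MEMBER_ANSWERS = dict(zip((q.prompt for q in EXAM), ["1997", "b", "C, A", "false"]))


def member_rag(prompt: str) -> str:
    """Simulates a system whose corpus contains the target document."""
    return MEMBER_ANSWERS.get(prompt, "unknown")


def non_member_rag(prompt: str) -> str:
    """Simulates a system that cannot retrieve the target document."""
    return "unknown"


if __name__ == "__main__":
    print(exam_score(EXAM, member_rag), infer_membership(EXAM, member_rag))
    print(exam_score(EXAM, non_member_rag), infer_membership(EXAM, non_member_rag))
```

In practice the responses would be free-form model output, so answer extraction and threshold calibration would need more care than the exact-match comparison used here.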
Problem

Research questions and friction points this paper is trying to address.

Membership Inference Attack
Retrieval-Augmented Generation
Black-Box Attack
Document-Level Privacy
RAG Security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Membership Inference Attack
Retrieval-Augmented Generation
Black-Box Attack
Hard Evidence Extraction
Exam-Style Probing
Zelin Guan
College of Cyber Security, Jinan University, Guangzhou, China
Shengda Zhuo
College of Cyber Security, Jinan University, Guangzhou, China
Zeyan Li
ByteDance
AIOps, Intelligent Operations, Software Reliability
Jinchun He
Institute of Artificial Intelligence, Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China; also with Zhongguancun Laboratory, Beijing, China
Wangjie Qiu
Institute of Artificial Intelligence, Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China; also with Zhongguancun Laboratory, Beijing, China
Zhiming Zheng
Institute of Artificial Intelligence, Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, Beihang University, Beijing 100191, China; also with Zhongguancun Laboratory, Beijing, China
Shuqiang Huang
College of Cyber Security, Jinan University, Guangzhou, China