Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private

📅 2025-11-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the risk of sensitive data leakage in retrieval-augmented generation (RAG) systems under multi-query workloads, proposing a differentially private RAG framework that balances privacy and utility. Methodologically, it introduces two algorithms, MURAG and MURAG-ADA, which enable individual, document-level privacy filtering and adaptive threshold release. Crucially, a document's accumulated privacy loss depends on how often that document is retrieved, rather than on the total number of queries, thereby overcoming the traditional bottleneck of cumulative privacy budget exhaustion. The approach integrates dynamic privacy budget allocation with a privatized threshold mechanism. Extensive evaluation across multiple LLMs and datasets demonstrates that, under a practical privacy budget (ε ≈ 10) over hundreds of queries, the framework achieves significantly higher question-answering accuracy than single-query baselines, supporting both its theoretical soundness and engineering viability.
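The individual-filter idea from the summary can be sketched in a few lines. Everything below is illustrative: the class and function names, the per-document budget accounting, and the fixed per-query cost are assumptions for exposition, not the paper's implementation.

```python
class IndividualPrivacyFilter:
    """Illustrative per-document privacy accountant (all names assumed).

    Each document carries its own epsilon budget; a query charges only the
    documents it actually retrieves, so a document's cumulative privacy
    loss depends on its retrieval count, not on the total query count.
    """

    def __init__(self, budget: float):
        self.budget = budget   # per-document epsilon budget
        self.spent = {}        # doc_id -> epsilon consumed so far

    def charge(self, doc_id: str, query_epsilon: float) -> bool:
        """Charge this query's cost to one document; refuse if it would overspend."""
        new_total = self.spent.get(doc_id, 0.0) + query_epsilon
        if new_total > self.budget:
            return False       # budget exhausted: document is filtered out
        self.spent[doc_id] = new_total
        return True


def filtered_retrieve(filt, candidate_ids, query_epsilon):
    """Keep only candidates whose individual budget still covers this query."""
    return [d for d in candidate_ids if filt.charge(d, query_epsilon)]
```

For example, with a per-document budget of ε = 10 and a cost of 1 per retrieval, a document drops out of retrieval only after its tenth retrieval, regardless of how many total queries the system has answered.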

📝 Abstract
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, however, unprotected RAG systems are at risk of leaking private information. Prior work has introduced differential privacy (DP) guarantees for RAG, but only in single-query settings, which fall short of realistic usage. In this paper, we study the more practical multi-query setting and propose two DP-RAG algorithms. The first, MURAG, leverages an individual privacy filter so that the accumulated privacy loss only depends on how frequently each document is retrieved rather than the total number of queries. The second, MURAG-ADA, further improves utility by privately releasing query-specific thresholds, enabling more precise selection of relevant documents. Our experiments across multiple LLMs and datasets demonstrate that the proposed methods scale to hundreds of queries within a practical DP budget ($\varepsilon \approx 10$), while preserving meaningful utility.
Problem

Research questions and friction points this paper is trying to address.

Protecting sensitive data in multi-query retrieval-augmented generation systems
Addressing privacy leakage risks in LLMs when handling confidential documents
Scaling differential privacy guarantees across hundreds of queries while maintaining utility
Innovation

Methods, ideas, or system contributions that make the work stand out.

Individual privacy filters control accumulated privacy loss
Query-specific thresholds enable precise document selection
Multi-query DP-RAG algorithms scale to hundreds of queries
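The paper does not spell out MURAG-ADA's threshold-release step here, but the second bullet can be illustrated with a standard construction: add Laplace noise to a query-specific relevance threshold before using it to select documents. The function name, parameters, and noise calibration below are assumptions, a generic sketch rather than the paper's algorithm.

```python
import random


def noisy_threshold_select(scores, threshold, epsilon, sensitivity=1.0):
    """Release a Laplace-noised threshold, then keep documents scoring above it.

    `scores` maps doc_id -> relevance score for the current query. The noise
    scale sensitivity/epsilon follows the standard Laplace mechanism; the
    difference of two i.i.d. exponential samples is a Laplace(0, b) sample.
    """
    b = sensitivity / epsilon
    noise = random.expovariate(1.0 / b) - random.expovariate(1.0 / b)
    noisy_t = threshold + noise
    return [doc for doc, s in scores.items() if s >= noisy_t]
```

With a large ε the released threshold is close to the true one and selection is nearly exact; a tighter ε blurs the cutoff, trading retrieval precision for privacy.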