$p^2$RAG: Privacy-Preserving RAG Service Supporting Arbitrary Top-$k$ Retrieval

2026-03-15
AI Summary
This work addresses the inefficiency of existing privacy-preserving RAG systems, which rely on secure sorting and thus struggle to support arbitrary top-$k$ retrieval, limiting large language models' ability to leverage larger retrieval sets for improved accuracy. To overcome this, the authors propose a novel privacy-preserving top-$k$ retrieval mechanism that eliminates explicit sorting. Operating under a two-server semi-honest, non-colluding model, the approach integrates secret sharing, interactive binary search, and access verification to ensure both query and data privacy while supporting any $k$ value. Experimental results demonstrate that, for $k$ ranging from 16 to 1024, the proposed method achieves a 3$\times$ to 300$\times$ speedup over the current state-of-the-art PRAG system, substantially enhancing retrieval efficiency and flexibility at scale.
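
To illustrate how a top-$k$ set can be found without sorting, here is a minimal plaintext sketch of threshold bisection. It is not the authors' protocol: it runs in the clear with no secret sharing, and it assumes integer similarity scores. The names `count_at_least` and `top_k_by_bisection` are hypothetical. The idea is to search for the largest threshold at which at least $k$ scores qualify, using only counting queries.

```python
from typing import List


def count_at_least(scores: List[int], threshold: int) -> int:
    """Count how many scores are greater than or equal to the threshold."""
    return sum(1 for s in scores if s >= threshold)


def top_k_by_bisection(scores: List[int], k: int) -> List[int]:
    """Return indices of the top-k scores using bisection on a threshold.

    Assumption: scores are integers. If several scores tie at the final
    threshold, more than k indices may be returned.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")

    lo, hi = min(scores), max(scores)
    # Invariant: count_at_least(scores, lo) >= k.
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the interval always shrinks
        if count_at_least(scores, mid) >= k:
            lo = mid  # threshold can still be raised
        else:
            hi = mid - 1  # threshold is too high
    return [i for i, s in enumerate(scores) if s >= lo]


if __name__ == "__main__":
    print(top_k_by_bisection([12, 95, 40, 77, 3, 58], 3))
```

Each iteration needs only one counting pass over the scores, so the number of rounds depends on the score range rather than on $k$. In a private setting, the bounds and the per-round count would also have to be handled carefully to limit leakage; this sketch ignores that.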

Abstract
Retrieval-Augmented Generation (RAG) enables large language models to use external knowledge, but outsourcing the RAG service raises privacy concerns for both data owners and users. Privacy-preserving RAG systems address these concerns by performing secure top-$k$ retrieval, which typically relies on secure sorting to identify relevant documents. However, existing systems struggle to support arbitrary $k$: they cannot change $k$, they introduce new security issues, or their efficiency degrades with large $k$. This is a significant limitation because modern long-context models generally achieve higher accuracy with larger retrieval sets. We propose $p^2$RAG, a privacy-preserving RAG service that supports arbitrary top-$k$ retrieval. Unlike existing systems, $p^2$RAG avoids sorting candidate documents. Instead, it uses an interactive bisection method to determine the set of top-$k$ documents. For security, $p^2$RAG uses secret sharing on two semi-honest non-colluding servers to protect the data owner's database and the user's prompt. It enforces restrictions and verification to defend against malicious users and to tightly bound the information leakage of the database. The experiments show that $p^2$RAG is 3--300$\times$ faster than the state-of-the-art PRAG for $k = 16$--$1024$.
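
As a rough illustration of the secret-sharing building block, the sketch below shows two-party additive sharing. The abstract does not specify the sharing scheme, so the additive scheme, the 32-bit ring, and the names `MODULUS`, `share`, `reconstruct`, and `add_shares` are all assumptions made for this example.

```python
import secrets
from typing import Tuple

# Assumption: values live in the ring of integers modulo 2**32.
MODULUS = 2**32


def share(value: int) -> Tuple[int, int]:
    """Split a value into two additive shares, one per server."""
    share0 = secrets.randbelow(MODULUS)  # uniformly random mask
    share1 = (value - share0) % MODULUS  # the value minus the mask
    return share0, share1


def reconstruct(share0: int, share1: int) -> int:
    """Recombine both shares to recover the original value."""
    return (share0 + share1) % MODULUS


def add_shares(x_share: int, y_share: int) -> int:
    """Local step on one server: add its shares of x and y."""
    return (x_share + y_share) % MODULUS


if __name__ == "__main__":
    x0, x1 = share(5)
    y0, y1 = share(7)
    # Each server adds its own shares without seeing x or y.
    z0, z1 = add_shares(x0, y0), add_shares(x1, y1)
    print(reconstruct(z0, z1))
```

Each share on its own is uniformly distributed, so a single server learns nothing about the value as long as the two servers do not collude. Addition is local; multiplying two shared values would need extra interaction, which this sketch omits.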
Problem

Research questions and friction points this paper is trying to address.

Privacy-Preserving RAG
Top-k Retrieval
Arbitrary k
Secure Retrieval
Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Privacy-Preserving RAG
Arbitrary Top-k Retrieval
Interactive Bisection
Secret Sharing
Secure Document Retrieval
Yulong Ming
Department of Computer Science, City University of Hong Kong, Hong Kong
Mingyue Wang
Peng Cheng Laboratory, Shenzhen, Guangdong, China
Jijia Yang
Department of Computer Science, City University of Hong Kong, Hong Kong
Cong Wang
Department of Computer Science, City University of Hong Kong
cloud, security, big data, computation outsourcing, access control
Xiaohua Jia
Chinese Academy of Science