Privacy-Preserving Approximate Nearest Neighbor Search on High-Dimensional Data

📅 2025-08-14
🤖 AI Summary
This paper addresses privacy-preserving approximate k-nearest neighbor search (PP-ANNS) over high-dimensional vectors stored on an untrusted cloud server. The authors propose a solution that runs primarily on a single server and keeps user involvement low. It has three components: (1) a distance comparison encryption scheme that supports exact distance comparisons over ciphertexts, accompanied by a security analysis; (2) a privacy-preserving index that combines a state-of-the-art k-ANNS method with an approximate distance computation method; and (3) a filter-and-refine search strategy built on that index. In the reported experiments, the method accelerates PP-ANNS by up to three orders of magnitude over state-of-the-art methods without compromising accuracy.

📝 Abstract
In the era of cloud computing and AI, data owners outsource ubiquitous vectors to the cloud, which furnishes approximate $k$-nearest neighbor search ($k$-ANNS) services to users. To protect data privacy against the untrusted server, privacy-preserving $k$-ANNS (PP-ANNS) on vectors has become a fundamental and urgent problem. However, existing PP-ANNS solutions fail to meet the requirements of data privacy, efficiency, accuracy, and minimal user involvement at the same time. To tackle this challenge, we introduce a novel solution that executes PP-ANNS primarily on a single cloud server, avoiding heavy communication overhead between the cloud and the user. To ensure data privacy, we introduce a novel encryption method named distance comparison encryption, which facilitates secure, efficient, and exact distance comparisons. To optimize the trade-off between data privacy and search performance, we design a privacy-preserving index that combines a state-of-the-art $k$-ANNS method with an approximate distance computation method. We then devise a search method that applies a filter-and-refine strategy to this index. Moreover, we provide a security analysis of our solution and conduct extensive experiments to demonstrate its superiority over existing solutions. In our experiments, our method accelerates PP-ANNS by up to 3 orders of magnitude compared to state-of-the-art methods without compromising accuracy.
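As a plaintext illustration only, and not the paper's encryption scheme, the following identity shows what an exact distance comparison has to decide. The symbols $q$ (query vector) and $a$, $b$ (two candidate vectors) are assumptions introduced here for the example.

$$\|q-a\|^2 - \|q-b\|^2 = \|a\|^2 - \|b\|^2 - 2\, q\cdot(a-b)$$

The candidate $a$ is closer to $q$ than $b$ exactly when the right-hand side is negative, so only the sign of this quantity is needed, not either distance itself. How distance comparison encryption evaluates such a comparison over ciphertexts is not specified in this summary.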
Problem

Research questions and friction points this paper is trying to address.

Privacy-preserving nearest neighbor search on high-dimensional data
Balancing data privacy, efficiency, and accuracy in cloud-based ANNS
Minimizing user involvement while securing data on untrusted servers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Single-server PP-ANNS avoids heavy communication overhead between the cloud and the user
Distance comparison encryption enables secure, efficient, and exact distance comparisons
Privacy-preserving index balances privacy and search performance
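
The filter-and-refine strategy mentioned in the abstract can be sketched in plaintext as follows. This is a minimal sketch under stated assumptions, not the authors' method: it uses no encryption, it replaces the privacy-preserving index with a linear scan, and it uses a random Gaussian projection as a hypothetical stand-in for the approximate distance computation. All function and parameter names (`make_projection`, `project`, `filter_and_refine`, `n_candidates`) are illustrative.

```python
import heapq
import random


def sq_dist(a, b):
    """Exact squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_projection(dim, low_dim, seed=0):
    """Random Gaussian projection matrix (assumed stand-in for the
    approximate distance computation; not taken from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(low_dim)]


def project(matrix, v):
    """Map a vector to its low-dimensional sketch."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


def filter_and_refine(query, data, sketches, matrix, k, n_candidates):
    """Return indices of the k nearest candidates, ordered by exact distance."""
    # Filter: rank every point by its cheap sketch distance (linear scan here,
    # where the paper uses an index) and keep the n_candidates best.
    q_sketch = project(matrix, query)
    candidates = heapq.nsmallest(
        n_candidates,
        range(len(data)),
        key=lambda i: sq_dist(q_sketch, sketches[i]),
    )
    # Refine: re-rank only the surviving candidates with exact distances.
    return heapq.nsmallest(k, candidates, key=lambda i: sq_dist(query, data[i]))


if __name__ == "__main__":
    rng = random.Random(1)
    data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
    matrix = make_projection(dim=16, low_dim=4)
    sketches = [project(matrix, v) for v in data]
    query = [rng.gauss(0.0, 1.0) for _ in range(16)]
    print(filter_and_refine(query, data, sketches, matrix, k=5, n_candidates=50))
```

A larger `n_candidates` costs more exact comparisons in the refine phase but makes it less likely that a true neighbor is dropped by the filter. When `n_candidates` equals the dataset size, the result matches a brute-force exact search.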
Authors

Yingfan Liu, Xidian University (Vector Database; High-performance Computations)
Yandi Zhang, School of Computer Science and Technology, Xidian University, Xi'an, China
Jiadong Xie, The Chinese University of Hong Kong, Hong Kong SAR, China
Hui Li, School of Computer Science and Technology, Xidian University, Xi'an, China
Jeffrey Xu Yu, Chinese University of Hong Kong (Database; Data Mining)
Jiangtao Cui, School of Computer Science and Technology, Xidian University, Xi'an, China