Privacy-Preserving Approximate Nearest Neighbor Search on High-Dimensional Data

📅 2025-08-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses privacy-preserving approximate k-nearest neighbor search (PP-ANNS) for high-dimensional vector data in untrusted cloud environments. It proposes a single-server solution designed to be secure, efficient, and low in user involvement. The method introduces: (1) a distance comparison encryption scheme, backed by a security analysis, that enables exact distance comparisons over ciphertexts; (2) a privacy-preserving index that combines a state-of-the-art k-ANNS method with approximate distance computation; and (3) a filter-and-refine search strategy built on that index. Experiments show query processing accelerated by up to three orders of magnitude over state-of-the-art PP-ANNS methods without compromising accuracy. The solution thus balances data privacy, efficiency, accuracy, and minimal user involvement.

📝 Abstract

In the era of cloud computing and AI, data owners outsource their vectors to the cloud, which provides approximate $k$-nearest neighbor search ($k$-ANNS) services to users. To protect data privacy against the untrusted server, privacy-preserving $k$-ANNS (PP-ANNS) on vectors has become a fundamental and urgent problem. However, existing PP-ANNS solutions fail to meet the requirements of data privacy, efficiency, accuracy, and minimal user involvement at the same time. To tackle this challenge, we introduce a novel solution that executes PP-ANNS primarily on a single cloud server, avoiding heavy communication overhead between the cloud and the user. To ensure data privacy, we introduce a novel encryption method named distance comparison encryption, which enables secure, efficient, and exact distance comparisons. To optimize the trade-off between data privacy and search performance, we design a privacy-preserving index that combines a state-of-the-art $k$-ANNS method with an approximate distance computation method. We then devise a search method that applies a filter-and-refine strategy on top of the index. Moreover, we provide a security analysis of our solution and conduct extensive experiments to demonstrate its superiority over existing solutions. In our experiments, our method accelerates PP-ANNS by up to 3 orders of magnitude compared to state-of-the-art methods without compromising accuracy.

Problem

Research questions and friction points this paper is trying to address.

Privacy-preserving nearest neighbor search on high-dimensional data

Balancing data privacy, efficiency, and accuracy in cloud-based ANNS

Minimizing user involvement while securing data on untrusted servers

Innovation

Methods, ideas, or system contributions that make the work stand out.

Single-server PP-ANNS reduces communication overhead

Distance comparison encryption ensures secure exact comparisons

Privacy-preserving index balances privacy and search performance
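
The filter-and-refine strategy mentioned in the abstract can be illustrated with a minimal plaintext sketch. This is not the paper's scheme and performs no encryption: the random projection used for approximate distances, the `closer` comparison oracle standing in for encrypted distance comparison, and all function names and parameters below are assumptions made purely for illustration. The point it shows is that the refine phase needs only pairwise "which is closer" answers, never the distances themselves.

```python
import heapq
import random
from functools import cmp_to_key


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_projection(dim, low_dim, seed=0):
    """Hypothetical stand-in for an approximate-distance method:
    a random Gaussian projection matrix (low_dim x dim)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(low_dim)]


def project(matrix, v):
    """Map a vector into the low-dimensional projected space."""
    return [sum(r * x for r, x in zip(row, v)) for row in matrix]


def filter_then_refine(data, query, k, n_candidates, matrix):
    # Filter phase: rank all points by a cheap approximate distance and keep
    # the n_candidates best. A real index would precompute the projections
    # and avoid scanning every point; this sketch scans for simplicity.
    q_low = project(matrix, query)
    approx = [(sq_dist(project(matrix, v), q_low), i) for i, v in enumerate(data)]
    candidates = [i for _, i in heapq.nsmallest(n_candidates, approx)]

    # Refine phase: order the candidates using only a comparison oracle.
    # Here the oracle computes plaintext distances; in a privacy-preserving
    # setting it would be replaced by a comparison over ciphertexts.
    def closer(i, j):
        di, dj = sq_dist(data[i], query), sq_dist(data[j], query)
        return (di > dj) - (di < dj)

    return sorted(candidates, key=cmp_to_key(closer))[:k]
```

A larger `n_candidates` makes the filter phase less likely to discard a true neighbor at the cost of more refine-phase comparisons; when it equals the dataset size, the result coincides with exact brute-force search.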