AI Summary
This paper addresses privacy-preserving approximate $k$-nearest neighbor search (PP-ANNS) over high-dimensional vector data held by an untrusted cloud server. It proposes a single-server solution designed to be secure and efficient while requiring minimal user involvement. The method introduces: (1) a distance comparison encryption scheme that supports secure, efficient, and exact distance comparisons over ciphertexts, accompanied by a security analysis; (2) a privacy-preserving index that combines a state-of-the-art $k$-ANNS method with approximate distance computation; and (3) a filter-and-refine two-phase search strategy built on that index. Experimental results show that the approach accelerates query processing by up to three orders of magnitude over state-of-the-art methods without compromising accuracy. The solution thus balances privacy guarantees, efficiency, and practical deployability.
Abstract
In the era of cloud computing and AI, data owners outsource their vectors to the cloud, which provides approximate $k$-nearest neighbor search ($k$-ANNS) services to users. To protect data privacy against the untrusted server, privacy-preserving $k$-ANNS (PP-ANNS) on vectors has become a fundamental and urgent problem. However, existing PP-ANNS solutions fail to satisfy the requirements of data privacy, efficiency, accuracy, and minimal user involvement simultaneously. To tackle this challenge, we introduce a novel solution that executes PP-ANNS primarily on a single cloud server, avoiding heavy communication overhead between the cloud and the user. To ensure data privacy, we introduce a novel encryption method, distance comparison encryption, which supports secure, efficient, and exact distance comparisons. To balance data privacy against search performance, we design a privacy-preserving index that combines a state-of-the-art $k$-ANNS method with an approximate distance computation method. We then devise a search method that applies a filter-and-refine strategy on top of this index. Moreover, we provide a security analysis of our solution and conduct extensive experiments to demonstrate its superiority over existing solutions. In our experiments, our method accelerates PP-ANNS by up to three orders of magnitude compared to state-of-the-art methods without compromising accuracy.
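The filter-and-refine idea mentioned in the abstract can be illustrated with a minimal plaintext sketch. This is not the paper's scheme: it involves no encryption and no index, the filter phase is a brute-force scan, and every name in it (`filter_then_refine`, `approx_vecs`, `num_candidates`) is a hypothetical chosen for illustration. It only shows the general pattern, assuming the server holds a cheap approximate representation for candidate selection and a separate way to compare exact distances for the final ranking.

```python
import heapq


def squared_l2(a, b):
    """Exact squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filter_then_refine(query, exact_vecs, approx_vecs, k, num_candidates):
    """Return ids of the k nearest vectors among a filtered candidate set.

    approx_vecs is a hypothetical stand-in for whatever approximate
    representation a real system would store; here it is plain data.
    """
    # Filter phase: rank all ids by distance to the approximate vectors
    # and keep only the best num_candidates of them.
    candidates = heapq.nsmallest(
        num_candidates,
        range(len(approx_vecs)),
        key=lambda i: squared_l2(query, approx_vecs[i]),
    )
    # Refine phase: re-rank the surviving candidates by exact distance.
    return heapq.nsmallest(
        k, candidates, key=lambda i: squared_l2(query, exact_vecs[i])
    )


def brute_force(query, exact_vecs, k):
    """Reference result: exact k nearest ids over the whole dataset."""
    return heapq.nsmallest(
        k, range(len(exact_vecs)), key=lambda i: squared_l2(query, exact_vecs[i])
    )


if __name__ == "__main__":
    exact = [[0, 0], [1, 0], [0, 2], [5, 5], [3, 3]]
    # Approximate copy in which vector 0 is badly distorted (assumption).
    approx = [[9, 9], [1, 0], [0, 2], [5, 5], [3, 3]]
    query = [0, 0]
    print(filter_then_refine(query, exact, approx, k=2, num_candidates=2))
    print(filter_then_refine(query, exact, approx, k=2, num_candidates=5))
    print(brute_force(query, exact, k=2))
```

In this sketch a larger `num_candidates` spends more refine work in exchange for a lower chance of missing a true neighbor that the approximate distances ranked poorly; when `num_candidates` equals the dataset size, the result matches the brute-force ranking.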