LIRA: A Learning-based Query-aware Partition Framework for Large-scale ANN Search

📅 2025-03-30
🤖 AI Summary
In large-scale approximate nearest neighbor (ANN) search, two key bottlenecks hinder performance: (1) blind partition probing during query processing—static center-distance-based ranking ignores underlying data distribution, leading to unnecessary partition accesses; and (2) boundary effects in index construction—partition boundary misalignment induces heavy-tailed kNN distributions, degrading nprobe efficiency. To address these, we propose the first query-aware, end-to-end learnable partition probing framework that dynamically adapts nprobe per query. We further introduce redundancy-aware partition optimization and a joint training strategy to mitigate tail effects while maintaining low query fan-out. Evaluated on multiple real-world vector datasets, our method achieves an average 35% latency reduction and 12% recall improvement over state-of-the-art baselines including FAISS and IVF-PQ, significantly enhancing the trade-off among accuracy, latency, and fan-out.

📝 Abstract
Approximate nearest neighbor search is fundamental in information retrieval. Previous partition-based methods improve search efficiency by probing only a subset of partitions, yet they face two common issues. In the query phase, the usual strategy is to probe partitions by the rank of the query's distance to each partition centroid, which inevitably probes irrelevant partitions because it ignores the data distribution. In the partition construction phase, all partition-based methods face the boundary problem, which scatters a query's nearest neighbors across multiple partitions, resulting in a long-tailed kNN distribution and degrading the optimal nprobe (i.e., the number of partitions probed). To address these issues, we propose LIRA, a LearnIng-based queRy-aware pArtition framework. Specifically, we propose a probing model that directly probes the partitions containing the kNN of a query, which reduces probing waste and allows query-aware probing with an individual nprobe for each query. Moreover, we incorporate the probing model into a learning-based redundancy strategy to mitigate the adverse impact of the long-tailed kNN distribution on search efficiency. Extensive experiments on real-world vector datasets demonstrate the superiority of LIRA in the trade-off among accuracy, latency, and query fan-out. The code is available at https://github.com/SimoneZeng/LIRA-ANN-search.
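
For readers unfamiliar with the baseline the abstract criticizes, the following is a minimal sketch of centroid-rank probing with a fixed nprobe. It is not taken from the LIRA code base: the function names, the toy centroids, and the toy partitions are all assumptions chosen to show how a query near a partition boundary can lose its true nearest neighbor when nprobe is too small.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def probe_by_centroid_rank(query, centroids, nprobe):
    """Ids of the nprobe partitions whose centroids are closest to the query."""
    order = sorted(range(len(centroids)),
                   key=lambda i: squared_l2(query, centroids[i]))
    return order[:nprobe]

def search(query, centroids, partitions, nprobe, k):
    """Scan only the probed partitions and return the k nearest points found."""
    candidates = []
    for pid in probe_by_centroid_rank(query, centroids, nprobe):
        candidates.extend(partitions[pid])
    candidates.sort(key=lambda p: squared_l2(query, p))
    return candidates[:k]

# Toy data (hypothetical): three partitions, each point stored with its nearest centroid.
centroids = [(0, 0), (10, 0), (0, 10)]
partitions = [[(1, 1), (0, 2)], [(9, 0), (6, 1)], [(0, 9)]]
query = (4, 0)

print(search(query, centroids, partitions, nprobe=1, k=1))  # [(1, 1)]
print(search(query, centroids, partitions, nprobe=2, k=1))  # [(6, 1)]
```

In this toy case the query is closest to centroid 0, but its true nearest neighbor (6, 1) sits in partition 1, so the search only finds it once nprobe is raised to 2 for this particular query.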
Problem

Research questions and friction points this paper is trying to address.

Improves query efficiency by reducing irrelevant partition probing
Addresses boundary problem in partition-based nearest neighbor search
Optimizes search accuracy, latency, and query fan-out trade-off
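
The boundary problem listed above can be made concrete by counting how a query's exact kNN spread over partitions. The sketch below is an illustration only, with hypothetical names and toy data. The number of distinct partitions in the resulting histogram is the smallest nprobe that even a perfect probing policy would need for that query.

```python
from collections import Counter

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_partition_histogram(query, points, assignments, k):
    """Count how many of the query's exact k nearest neighbors fall in each partition.

    `assignments[i]` is the partition id of `points[i]`.
    """
    ranked = sorted(range(len(points)),
                    key=lambda i: squared_l2(query, points[i]))
    return Counter(assignments[i] for i in ranked[:k])

# Toy data (hypothetical): five points spread over three partitions.
points = [(1, 1), (0, 2), (9, 0), (6, 1), (0, 9)]
assignments = [0, 0, 1, 1, 2]

hist = knn_partition_histogram((4, 0), points, assignments, k=3)
print(dict(hist))  # {1: 1, 0: 2}
print(len(hist))   # 2 partitions must be probed to recover all 3 neighbors
```

Computing this histogram over many queries is one simple way to see whether a few partitions hold most of the neighbors while a tail of partitions each contributes only one or two.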
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning-based query-aware partition framework
Probing model for direct kNN partition access
Redundancy strategy for long-tailed distribution
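
To illustrate what "query-aware probing" can mean in practice, here is a minimal sketch under stated assumptions: a model has already produced one score per partition estimating whether that partition holds any of the query's kNN, and partitions are selected by a threshold. The function name, the threshold value, and the hand-written scores are hypothetical stand-ins, not LIRA's actual model or API. The redundancy strategy is not sketched here.

```python
def select_partitions(scores, threshold=0.5):
    """Pick every partition whose predicted score clears the threshold.

    `scores[i]` stands in for a learned estimate that partition i contains
    at least one of the query's kNN. If no score clears the threshold, fall
    back to the single best partition so that every query probes something.
    """
    chosen = [i for i, s in enumerate(scores) if s >= threshold]
    if not chosen:
        chosen = [max(range(len(scores)), key=lambda i: scores[i])]
    return chosen

# Hand-written scores (hypothetical) for two different queries.
easy_query_scores = [0.95, 0.03, 0.02]      # neighbors concentrated in one partition
boundary_query_scores = [0.60, 0.55, 0.05]  # neighbors split across a boundary

print(select_partitions(easy_query_scores))      # [0]
print(select_partitions(boundary_query_scores))  # [0, 1]
```

The effective nprobe is simply the length of the returned list, so it varies per query instead of being a global constant as in centroid-rank probing.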
Ximu Zeng
University of Electronic Science and Technology of China
Vector Search, AI4DB, Autonomous Driving
Liwei Deng
University of Electronic Science and Technology of China
Penghao Chen
University of Electronic Science and Technology of China
Xu Chen
University of Electronic Science and Technology of China
Han Su
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China
Kai Zheng
University of Electronic Science and Technology of China, Yangtze Delta Region Institute (Quzhou), School of Computer Science and Engineering, UESTC, Shenzhen Institute for Advanced Study, UESTC