AI Summary
This paper addresses the Range-Filtered Approximate k-Nearest Neighbors (RFAKNN) problem: efficiently retrieving the *k* approximate nearest neighbors to a query vector in high-dimensional space, subject to numeric range constraints. Conventional approaches incur an extra *O*(log *N*) factor in query cost due to strict subtree-range matching in index structures. To overcome this, we propose an elastic graph index coupled with a range relaxation strategy. We theoretically prove, for the first time, that any query range can be covered by at most two elastic subranges, eliminating the overhead of divide-and-conquer range matching. Our design preserves retrieval accuracy while substantially improving efficiency: on real-world datasets it achieves a 1.5×–6× speedup over state-of-the-art methods while maintaining high recall and precision. This work establishes a new trade-off frontier between accuracy and efficiency for RFAKNN.
Abstract
Range-filtering approximate $k$-nearest neighbor (RFAKNN) search takes as input a query vector and a numeric range, returning $k$ points from a database of $N$ high-dimensional points. The returned points must satisfy two criteria: their numeric values must lie within the specified query range, and they must be approximately the $k$ nearest points to the query vector. To strike a better balance between query accuracy and efficiency, we propose novel methods that relax the strict requirement for subranges to \textit{exactly} match the query range. This elastic relaxation is based on a theoretical insight: allowing the controlled inclusion of out-of-range points during the search does not compromise the bounded complexity of the search process. Building on this insight, we prove that our methods reduce the number of required subranges to at most \textit{two}, eliminating the $O(\log N)$ query overhead inherent in existing methods. Extensive experiments on real-world datasets demonstrate that our proposed methods outperform state-of-the-art approaches, achieving performance improvements of 1.5x to 6x while maintaining high accuracy.
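To build intuition for the two-subrange claim, here is a minimal illustrative sketch (not the paper's actual elastic-index construction): the classic sparse-table trick shows that any integer range $[l, r]$ can be covered by at most two precomputed subranges of power-of-two length. The function name `two_subrange_cover` and the dyadic-length assumption are ours for illustration; the paper's elastic subranges additionally tolerate a controlled number of out-of-range points.

```python
def two_subrange_cover(l, r):
    """Cover the query range [l, r] (inclusive, integers) with at most
    two precomputed subranges of power-of-two length.

    Illustrative analogue of "at most two subranges per query":
    the two returned subranges overlap but together cover [l, r] exactly.
    """
    if l == r:
        return [(l, r)]
    # Largest k such that 2**k <= length of the query range.
    k = (r - l + 1).bit_length() - 1
    # One subrange anchored at each endpoint; their union is [l, r].
    return [(l, l + 2**k - 1), (r - 2**k + 1, r)]

# A range of length 10 is covered by two overlapping length-8 subranges:
print(two_subrange_cover(3, 12))  # [(3, 10), (5, 12)]
```

In this toy setting only $O(N \log N)$ subranges need to be precomputed, yet every query touches at most two of them; the paper's relaxation pursues the same "constant number of subranges per query" effect for its graph index.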