Efficient Approximate Nearest Neighbor Search under Multi-Attribute Range Filter

📅 2026-02-17
🤖 AI Summary
This work addresses approximate nearest neighbor search over high-dimensional vectors when queries also impose numeric range constraints on multiple attributes, a setting known as range-filtering approximate nearest neighbor search (RFANNS). The authors propose KHI, an index that attaches Hierarchical Navigable Small World (HNSW) graphs to the nodes of a partitioning tree built over the attribute space. A skew-aware splitting rule bounds the tree height by $O(\log n)$, and a query is answered by routing through the tree and running greedy search on the HNSW graphs. On four real-world datasets, KHI improves query throughput (QPS) over the state-of-the-art RFANNS baseline by 2.46× on average and by up to 16.22× on the hard dataset while maintaining high recall, with larger gains for smaller selectivity, larger k, and higher predicate cardinality.
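To make the query type concrete, the following is a minimal sketch of an exact, linear-scan answer to a multi-attribute range-filtered k-nearest-neighbor query. It is not KHI; it is the brute-force reference that an approximate index would be measured against. The function names (`in_range`, `filtered_knn`, `recall_at_k`), the Euclidean metric, and the closed-interval predicate are all assumptions made for illustration.

```python
import math

def in_range(attrs, box):
    """True if every attribute lies inside its closed [lo, hi] interval."""
    return all(lo <= a <= hi for a, (lo, hi) in zip(attrs, box))

def filtered_knn(vectors, attrs, query, box, k):
    """Exact range-filtered k-NN by linear scan (assumed Euclidean distance).

    Returns the ids of the k nearest points that satisfy the range predicate,
    plus the selectivity (fraction of all points that satisfy it).
    """
    passing = [i for i, a in enumerate(attrs) if in_range(a, box)]
    passing.sort(key=lambda i: math.dist(vectors[i], query))
    return passing[:k], len(passing) / len(vectors)

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact answer that an approximate answer recovered."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Hypothetical toy data: 2-d vectors with two numeric attributes each.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
attrs = [(10, 5), (20, 1), (30, 9), (40, 4)]
ids, selectivity = filtered_knn(vectors, attrs, (0.0, 0.0), [(15, 45), (0, 5)], k=1)
```

In this toy query only points 1 and 3 satisfy both ranges, so the selectivity is 0.5 and the nearest passing point is point 1. A smaller selectivity means fewer points pass the filter, which is the regime where the summary reports the largest gains.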

📝 Abstract
Nearest neighbor search on high-dimensional vectors is fundamental in modern AI and database systems. In many real-world applications, queries involve constraints on multiple numeric attributes, giving rise to range-filtering approximate nearest neighbor search (RFANNS). While there exist RFANNS indexes for single-attribute range predicates, extending them to the multi-attribute setting is nontrivial and often ineffective. In this paper, we propose KHI, an index for multi-attribute RFANNS that combines an attribute-space partitioning tree with HNSW graphs attached to tree nodes. A skew-aware splitting rule bounds the tree height by $O(\log n)$, and queries are answered by routing through the tree and running greedy search on the HNSW graphs. Experiments on four real-world datasets show that KHI consistently achieves high query throughput while maintaining high recall. Compared with the state-of-the-art RFANNS baseline, KHI improves QPS by $2.46\times$ on average and up to $16.22\times$ on the hard dataset, with larger gains for smaller selectivity, larger $k$, and higher predicate cardinality.
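The general shape of "route through an attribute-space tree, then search per node" can be sketched as follows. This is a toy under stated assumptions, not the authors' implementation: the split here is a plain median-by-rank split that cycles through attributes (a stand-in for KHI's skew-aware rule, which the abstract does not specify), and a filtered linear scan stands in for greedy search on each node's HNSW graph. All names (`Node`, `route`, `search`, `leaf_size`) are hypothetical.

```python
import math

def in_box(point, box):
    """True if every coordinate lies inside its closed [lo, hi] interval."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(point, box))

class Node:
    """Partition-tree node over attribute space (illustrative only)."""
    def __init__(self, ids, attrs, depth=0, leaf_size=2):
        self.ids = ids
        dims = len(attrs[0])
        # Bounding box of the attribute values stored under this node.
        self.box = [(min(attrs[i][d] for i in ids), max(attrs[i][d] for i in ids))
                    for d in range(dims)]
        self.children = []
        if len(ids) > leaf_size:
            d = depth % dims
            order = sorted(ids, key=lambda i: attrs[i][d])
            mid = len(order) // 2  # split by rank, so both halves stay balanced
            self.children = [Node(order[:mid], attrs, depth + 1, leaf_size),
                             Node(order[mid:], attrs, depth + 1, leaf_size)]

def height(node):
    """Number of edges on the longest root-to-leaf path."""
    return 1 + max(map(height, node.children)) if node.children else 0

def route(node, query_box):
    """Return the nodes whose points must be examined for this range query."""
    pairs = list(zip(node.box, query_box))
    if any(hi < qlo or lo > qhi for (lo, hi), (qlo, qhi) in pairs):
        return []  # node box is disjoint from the query range
    covered = all(qlo <= lo and hi <= qhi for (lo, hi), (qlo, qhi) in pairs)
    if covered or not node.children:
        return [node]  # fully covered node, or a partially overlapping leaf
    return [n for child in node.children for n in route(child, query_box)]

def search(root, vectors, attrs, query, query_box, k):
    """Filtered k-NN: linear scan per routed node stands in for HNSW search."""
    cands = [i for node in route(root, query_box)
             for i in node.ids if in_box(attrs[i], query_box)]
    return sorted(cands, key=lambda i: (math.dist(vectors[i], query), i))[:k]
```

Because this toy split halves each node by rank rather than by attribute value, the height stays logarithmic in the number of points even when attribute values are heavily duplicated, which is the property the abstract attributes to the skew-aware rule. With the exact scan as the per-node search, `search` returns the same ids as a brute-force filter; swapping in an approximate graph search per node is what would trade recall for throughput.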

Problem

Research questions and friction points this paper is trying to address.

Approximate Nearest Neighbor Search
Multi-Attribute Range Filter
High-Dimensional Vectors
Range Filtering
RFANNS

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Attribute Range Filtering
Approximate Nearest Neighbor Search
HNSW Graph
Skew-Aware Partitioning
KHI Index