Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search

📅 2024-05-20
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
This paper addresses the low routing efficiency in shard-based approximate Maximum Inner Product Search (MIPS). We propose an uncertainty-aware optimistic routing framework that estimates potential maximum inner products within each cluster shard using first- and second-order moments of the inner-product distribution, enabling efficient shard pruning. Our key contributions are threefold: (i) introducing the principle of “optimism under uncertainty” into MIPS routing design; (ii) developing a lightweight routing algorithm relying solely on moment statistics; and (iii) constructing a data-size-invariant second-moment sketch structure with per-shard memory overhead of only *O*(1) vectors. Evaluated on standard benchmarks, our method achieves routing accuracy comparable to state-of-the-art approaches such as ScaNN, while reducing the number of probed vectors by up to 50%. The framework thus delivers both high accuracy and superior space efficiency.

📝 Abstract
Clustering-based nearest neighbor search is an effective method in which points are partitioned into geometric shards to form an index, with only a few shards searched during query processing to find a set of top-$k$ vectors. Even though the search efficacy is heavily influenced by the algorithm that identifies the shards to probe, it has received little attention in the literature. This work bridges that gap by studying routing in clustering-based maximum inner product search. We unpack existing routers and notice the surprising contribution of optimism. We then take a page from the sequential decision making literature and formalize that insight following the principle of "optimism in the face of uncertainty." In particular, we present a framework that incorporates the moments of the distribution of inner products within each shard to estimate the maximum inner product. We then present an instance of our algorithm that uses only the first two moments to reach the same accuracy as state-of-the-art routers such as ScaNN by probing up to $50\%$ fewer points on benchmark datasets. Our algorithm is also space-efficient: we design a sketch of the second moment whose size is independent of the number of points and requires $\mathcal{O}(1)$ vectors per shard.
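
The idea of routing with the first two moments can be illustrated with a minimal sketch. The abstract does not state the exact scoring rule, so the form used here (mean inner product plus `delta` standard deviations) is an assumption, and the names `shard_moments`, `optimistic_score`, `route`, and `delta` are hypothetical. Setting `delta=0` gives plain mean-based routing for comparison.

```python
import math

def shard_moments(points):
    """Return (mean vector, covariance matrix) of the points in one shard."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    cov = [[sum((p[a] - mean[a]) * (p[b] - mean[b]) for p in points) / n
            for b in range(d)] for a in range(d)]
    return mean, cov

def optimistic_score(query, mean, cov, delta=1.0):
    """Assumed score: mean inner product plus delta standard deviations."""
    d = len(query)
    mu = sum(query[j] * mean[j] for j in range(d))          # first moment
    var = sum(query[a] * cov[a][b] * query[b]               # second moment
              for a in range(d) for b in range(d))
    return mu + delta * math.sqrt(max(var, 0.0))

def route(query, shards, n_probe, delta=1.0):
    """Return indices of the n_probe shards with the highest score."""
    scores = [optimistic_score(query, *shard_moments(s), delta) for s in shards]
    return sorted(range(len(shards)), key=lambda i: -scores[i])[:n_probe]

# Toy data: shard 0 is tight around (1, 0); shard 1 is widely spread.
shards = [
    [(1.0, 0.1), (1.0, -0.1)],
    [(0.5, 2.0), (0.5, -2.0)],
]
query = (1.0, 1.0)

print(route(query, shards, n_probe=1, delta=0.0))  # mean-only routing: [0]
print(route(query, shards, n_probe=1, delta=1.0))  # optimistic routing: [1]
```

In this toy case the largest inner product with the query is 2.5, attained in shard 1, while shard 0 tops out at 1.1. Mean-only routing prefers shard 0 because its mean is better aligned with the query; the variance term is what directs the optimistic router to shard 1.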
Problem

Research questions and friction points this paper is trying to address.

Optimizing shard selection in clustering-based MIPS
Reducing queried points while maintaining search accuracy
Developing space-efficient sketches for second moment estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses optimism principle for routing decisions
Incorporates distribution moments for inner product estimation
Employs space-efficient sketch for second moment
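
One way to picture a second-moment sketch whose size does not depend on the number of points is a low-rank approximation of the shard covariance. The sketch below is only an illustration under that assumption: it keeps a single scaled eigenvector found by power iteration, which is not necessarily the construction used in the paper. The names `top_component` and `sketched_variance` are hypothetical.

```python
import math

def top_component(cov, iters=200):
    """Power iteration on a PSD matrix: return sqrt(lambda_1) * u_1.

    Assumes the uniform start vector is not orthogonal to the
    leading eigenvector.
    """
    d = len(cov)
    u = [1.0 / math.sqrt(d)] * d
    lam = 0.0
    for _ in range(iters):
        w = [sum(cov[a][b] * u[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * d          # zero matrix: nothing to keep
        u = [x / norm for x in w]
        lam = norm                    # converges to the top eigenvalue
    return [math.sqrt(lam) * x for x in u]

def sketched_variance(query, component):
    """Rank-1 approximation of q^T Sigma q using one stored vector."""
    proj = sum(q * c for q, c in zip(query, component))
    return proj * proj

# Covariance with eigenvalues 4 and 1 along the coordinate axes.
cov = [[4.0, 0.0], [0.0, 1.0]]
component = top_component(cov)        # one d-dimensional vector per shard

print(sketched_variance((1.0, 0.0), component))  # close to the exact value 4
print(sketched_variance((1.0, 1.0), component))  # close to 4; exact value is 5
```

The stored state is one vector of the same dimension as the data, regardless of how many points the shard holds. The price is that variance along directions outside the retained component is dropped, as the second query shows.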