🤖 AI Summary
Existing ε-recall-bounded evaluation paradigms for high-dimensional Approximate Nearest Neighbor (ANN) search account only for the recall of true positives and give no guarantee on how far incorrect results deviate from the true neighbors. To address this, the paper proposes the first error-bounded ANN framework, ensuring that all returned results are (1/δ)-approximations of the true nearest neighbors. The core methodological contribution is the construction of the first δ-monotonic graph index with rigorous theoretical guarantees, enabling backtrack-free, efficient query processing. Building on this, the authors design a locally degree-balanced δ-EMQG graph integrated with vector quantization, jointly optimizing indexing efficiency and retrieval accuracy. Evaluated on the SIFT1M dataset, the approach achieves 19,000 queries per second (QPS) at 99% recall, surpassing state-of-the-art methods by more than 40%.
📝 Abstract
Approximate nearest neighbor (ANN) search in high-dimensional spaces is a foundational component of many modern retrieval and recommendation systems. Currently, almost all algorithms follow an $ε$-Recall-Bounded principle when comparing performance: they require the ANN search results to achieve a recall of more than $1-ε$ and then compare queries-per-second (QPS) performance. However, this approach only accounts for the recall of true positive results and does not bound how far incorrect results may deviate from the exact nearest neighbors. To address this limitation, we focus on an Error-Bounded ANN method, which ensures that every returned result is a $(1/δ)$-approximation of the corresponding exact nearest neighbor. Our approach adopts a graph-based framework. To enable Error-Bounded ANN search, we propose a $δ$-EMG (Error-bounded Monotonic Graph), which, for the first time, provides a provable approximation guarantee for arbitrary queries. By enforcing a $δ$-monotonic geometric constraint during graph construction, $δ$-EMG ensures that any greedy search converges to a $(1/δ)$-approximate neighbor without backtracking. Building on this foundation, we design an error-bounded top-$k$ ANN search algorithm that adaptively controls approximation accuracy at query time. To make the framework practical at scale, we introduce $δ$-EMQG (Error-bounded Monotonic Quantized Graph), a localized and degree-balanced variant with near-linear construction complexity. We further integrate vector quantization to accelerate distance computation while preserving the theoretical guarantees. Extensive experiments on the ANN-Benchmarks datasets demonstrate the effectiveness of our approach. Under a recall requirement of 0.99, our algorithm achieves 19,000 QPS on the SIFT1M dataset, outperforming other methods by more than 40%.
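To make the backtrack-free search concrete, here is a minimal sketch of greedy routing on a graph index. This is an illustrative assumption, not the paper's implementation: the adjacency structure, the starting node, and the toy data are hypothetical, and building a graph that actually satisfies the $δ$-monotonic constraint is the paper's contribution and is not shown here. The sketch only demonstrates the query-time behavior the abstract describes: from the current node, move to the out-neighbor closest to the query, and stop at a local minimum without ever revisiting or backtracking.

```python
import numpy as np

def greedy_search(vectors, neighbors, query, start):
    """Greedy graph walk: repeatedly hop to the out-neighbor nearest to
    the query; terminate at a local minimum. On a delta-monotonic graph,
    this local minimum is guaranteed to be a (1/delta)-approximate
    nearest neighbor, with no backtracking required."""
    cur = start
    cur_dist = np.linalg.norm(vectors[cur] - query)
    while True:
        best, best_dist = cur, cur_dist
        for nb in neighbors[cur]:
            d = np.linalg.norm(vectors[nb] - query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == cur:  # no out-neighbor is closer: local minimum
            return cur, cur_dist
        cur, cur_dist = best, best_dist

# Toy example: four 2-D points connected in a ring (hypothetical graph).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
node, dist = greedy_search(pts, nbrs, np.array([0.9, 0.95]), start=0)
# The walk proceeds 0 -> 3 -> 2 and stops: node 2 is nearest to the query.
```

The per-hop cost is one distance computation per out-neighbor, which is why the paper's $δ$-EMQG variant pairs a degree-balanced graph with vector quantization to cheapen exactly these comparisons.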