δ-EMG: A Monotonic Graph Index for Approximate Nearest Neighbor Search

📅 2025-11-20
🤖 AI Summary
Existing ε-recall evaluation paradigms for high-dimensional Approximate Nearest Neighbor (ANN) search only measure the recall of true positives and provide no formal guarantee on how far incorrect results deviate from the true neighbors. To address this, the paper proposes the first error-bounded ANN framework, ensuring that every returned result is a (1/δ)-approximation of the true nearest neighbors. The core methodological contribution is the construction of the first δ-monotonic graph index with rigorous theoretical guarantees, enabling backtrack-free, efficient greedy query processing. The authors further design a locally degree-balanced variant, δ-EMQG, integrated with vector quantization, jointly optimizing indexing efficiency and retrieval accuracy. Evaluated on the SIFT1M dataset, the approach achieves 19,000 queries per second (QPS) at 99% recall, surpassing state-of-the-art methods by more than 40%.

📝 Abstract
Approximate nearest neighbor (ANN) search in high-dimensional spaces is a foundational component of many modern retrieval and recommendation systems. Currently, almost all algorithms follow an $ε$-Recall-Bounded principle when comparing performance: they require the ANN search results to achieve a recall of more than $1-ε$ and then compare queries-per-second (QPS) performance. However, this approach only accounts for the recall of true positive results and does not provide guarantees on the deviation of incorrect results. To address this limitation, we focus on an Error-Bounded ANN method, which ensures that the returned results are a $(1/δ)$-approximation of the true values. Our approach adopts a graph-based framework. To enable Error-Bounded ANN search, we propose a $δ$-EMG (Error-bounded Monotonic Graph), which, for the first time, provides a provable approximation for arbitrary queries. By enforcing a $δ$-monotonic geometric constraint during graph construction, $δ$-EMG ensures that any greedy search converges to a $(1/δ)$-approximate neighbor without backtracking. Building on this foundation, we design an error-bounded top-$k$ ANN search algorithm that adaptively controls approximation accuracy during query time. To make the framework practical at scale, we introduce $δ$-EMQG (Error-bounded Monotonic Quantized Graph), a localized and degree-balanced variant with near-linear construction complexity. We further integrate vector quantization to accelerate distance computation while preserving theoretical guarantees. Extensive experiments on the ANN-Benchmarks dataset demonstrate the effectiveness of our approach. Under a recall requirement of 0.99, our algorithm achieves 19,000 QPS on the SIFT1M dataset, outperforming other methods by more than 40%.
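The abstract's key mechanism is a greedy walk on a graph index that never needs to backtrack: at each node, move to the neighbor closest to the query, and stop at the first local minimum. The sketch below shows generic greedy graph search, which is the search primitive δ-EMG builds on; the function name, graph representation, and stopping rule are illustrative assumptions, and the actual δ-monotonic construction that makes the local minimum provably $(1/δ)$-approximate is not reproduced here.

```python
import numpy as np

def greedy_search(graph, vectors, query, entry):
    """Greedy best-first walk on a graph index (illustrative sketch).

    graph:   dict mapping node id -> list of neighbor ids (assumed layout)
    vectors: (n, d) array of database vectors
    query:   (d,) query vector
    entry:   id of the entry node

    On a monotonic graph, the walk that stops at the first local minimum
    needs no backtracking; the paper's delta-monotonic constraint is what
    turns that local minimum into a provable approximation guarantee.
    """
    cur = entry
    cur_dist = np.linalg.norm(vectors[cur] - query)
    while True:
        best, best_dist = cur, cur_dist
        for nb in graph[cur]:
            d = np.linalg.norm(vectors[nb] - query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == cur:  # no neighbor improves the distance: local minimum
            return cur, cur_dist
        cur, cur_dist = best, best_dist
```

On a toy path graph over points 0, 1, 2, 3 on a line, a query at 2.9 entered at node 0 walks 0 → 1 → 2 → 3 and stops, since no neighbor of 3 is closer.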
Problem

Research questions and friction points this paper is trying to address.

Existing ANN search methods guarantee recall of true positives but place no bound on the deviation of incorrect results
How to design a graph-based framework that supports error-bounded approximate nearest neighbor search
How to obtain a provable approximation for arbitrary queries via monotonic geometric constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monotonic graph index ensures provable approximation guarantees
Error-bounded top-k search with adaptive accuracy control
Quantized graph variant enables near-linear construction complexity
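The quantized variant accelerates distance computation with vector quantization. A standard way to do this, sketched below under the assumption that the paper uses a product-quantization-style scheme (the exact quantizer is not specified here), is asymmetric distance computation: encode each database vector as a few centroid ids, precompute one per-subspace distance table for the query, and replace each full d-dimensional distance with a handful of table lookups. All function names are hypothetical.

```python
import numpy as np

def pq_encode(X, codebooks):
    """Encode each row of X as per-subspace nearest-centroid ids.

    codebooks: list of m arrays, each (k, d/m) -- one codebook per subspace.
    Returns an (n, m) array of uint8 centroid ids.
    """
    m = len(codebooks)
    ds = X.shape[1] // m
    codes = np.empty((X.shape[0], m), dtype=np.uint8)
    for j, cb in enumerate(codebooks):
        sub = X[:, j * ds:(j + 1) * ds]
        # Squared distance from every subvector to every centroid.
        codes[:, j] = np.argmin(((sub[:, None] - cb[None]) ** 2).sum(-1), axis=1)
    return codes

def pq_adc(query, codes, codebooks):
    """Asymmetric distance computation: m lookups per database vector.

    Builds one table of squared distances per subspace (query vs. centroids),
    then sums table entries selected by each vector's codes -- O(m) per
    vector instead of O(d).
    """
    m = len(codebooks)
    ds = query.shape[0] // m
    tables = [((cb - query[j * ds:(j + 1) * ds]) ** 2).sum(-1)
              for j, cb in enumerate(codebooks)]
    dist2 = np.zeros(codes.shape[0])
    for j in range(m):
        dist2 += tables[j][codes[:, j]]
    return dist2
```

When every database subvector coincides with a centroid, the approximate squared distances are exact, which makes the scheme easy to sanity-check; in general the lookups return distances to centroids rather than to the original vectors.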
Liming Xiang
Beijing Institute of Technology, Beijing, China
Jing Feng
Beijing Institute of Technology, Beijing, China
Ziqi Yin
Jilin University
unsupervised domain adaptation, prompt learning
Zijian Li
Huawei Technologies Ltd, China
Daihao Xue
Huawei Technologies Ltd, China
Hongchao Qin
Beijing Institute of Technology
Graph Data Mining
Ronghua Li
Beijing Institute of Technology, Beijing, China
Guoren Wang
Beijing Institute of Technology