Graph-Based Nearest-Neighbor Search without the Spread

📅 2026-02-06

🤖 AI Summary

This work addresses the dependence of approximate nearest-neighbor (ANN) query time on the spread of the point set, which may be unbounded in the number of points n. Recent work builds linear-size nearest-neighbor graphs that answer ANN queries in time logarithmic in the spread. The authors show how to construct an external linear-size data structure that, combined with the linear-size graph, answers ANN queries in time logarithmic in n. Total space stays linear in n, and the query bound no longer depends on the spread.

📝 Abstract

Recent work showed how to construct nearest-neighbor graphs of linear size, on a given set $P$ of $n$ points in $\mathbb{R}^d$, such that one can answer approximate nearest-neighbor queries in time logarithmic in the spread. Unfortunately, the spread might be unbounded in $n$, and an interesting theoretical question is how to remove the dependency on the spread. Here, we show how to construct an external linear-size data structure that, combined with the linear-size graph, allows us to answer ANN queries in time logarithmic in $n$.
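
To see why the spread can be unbounded in $n$, the following minimal sketch computes it by brute force, assuming the standard definition (the ratio of the largest to the smallest pairwise distance in $P$). The helper names `spread` and `exponential_points` are hypothetical and are not taken from the paper; the example only illustrates the quantity, not the authors' data structure.

```python
import math
from itertools import combinations


def spread(points):
    """Ratio of the largest to the smallest pairwise distance (brute force, O(n^2))."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return max(dists) / min(dists)


def exponential_points(n):
    """Illustrative input: n points on a line at coordinates 1, 2, 4, ..., 2^(n-1)."""
    return [(float(2 ** i),) for i in range(n)]


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        s = spread(exponential_points(n))
        # log2 of the spread grows linearly in n for this input
        print(f"n={n:2d}  spread={s:.0f}  log2(spread)={math.log2(s):.2f}")
```

For this input the spread is $2^{n-1} - 1$, so a query bound that is logarithmic in the spread is linear in $n$, whereas a bound that is logarithmic in $n$ is unaffected by how the distances are scaled.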

Problem

Research questions and friction points this paper is trying to address.

nearest-neighbor search

spread

approximate nearest neighbor

graph-based search

logarithmic query time

Innovation

Methods, ideas, or system contributions that make the work stand out.

approximate nearest neighbor

graph-based search

spread-independent