Efficiently Constructing Sparse Navigable Graphs

📅 2025-07-17
🤖 AI Summary
Constructing sparse navigable graphs for graph-based nearest neighbor search is computationally expensive, so heuristics are used in practice. Method: The paper presents an $\tilde{O}(n^2)$-time algorithm, improving on the naive $O(n^3)$ greedy baseline, for constructing an approximately sparsest navigable graph under any distance function. It frames construction as $n$ correlated minimum set cover instances and applies techniques from streaming and sublinear-time set cover algorithms, combined with problem-specific preprocessing. Results: Given $n$ data points, the algorithm runs in $\tilde{O}(n^2)$ time and yields an $O(\log n)$-approximation to the sparsest navigable graph. This runtime is optimal up to logarithmic factors under the Strong Exponential Time Hypothesis (SETH), and obtaining better than an $O(\log n)$-approximation is proven NP-hard. For the related $\alpha$-shortcut reachable and $\tau$-monotonic graphs, the same techniques give algorithms that run in $\tilde{O}(n^{2.5})$ time or better, beating cubic time.

📝 Abstract
Graph-based nearest neighbor search methods have seen a surge of popularity in recent years, offering state-of-the-art performance across a wide variety of applications. Central to these methods is the task of constructing a sparse navigable search graph for a given dataset endowed with a distance function. Unfortunately, doing so is computationally expensive, so heuristics are universally used in practice. In this work, we initiate the study of fast algorithms with provable guarantees for search graph construction. For a dataset with $n$ data points, the problem of constructing an optimally sparse navigable graph can be framed as $n$ separate but highly correlated minimum set cover instances. This yields a naive $O(n^3)$ time greedy algorithm that returns a navigable graph whose sparsity is at most $O(\log n)$ higher than optimal. We improve significantly on this baseline, taking advantage of correlation between the set cover instances to leverage techniques from streaming and sublinear-time set cover algorithms. Combined with problem-specific pre-processing techniques, we present an $\tilde{O}(n^2)$ time algorithm for constructing an $O(\log n)$-approximate sparsest navigable graph under any distance function. The runtime of our method is optimal up to logarithmic factors under the Strong Exponential Time Hypothesis via a reduction from Monochromatic Closest Pair. Moreover, we prove that, as with general set cover, obtaining better than an $O(\log n)$-approximation is NP-hard, despite the significant additional structure present in the navigable graph problem. Finally, we show that our techniques can also beat cubic time for the closely related and practically important problems of constructing $α$-shortcut reachable and $τ$-monotonic graphs, which are also used for nearest neighbor search. For such graphs, we obtain $\tilde{O}(n^{2.5})$ time or better algorithms.
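
The set cover framing behind the naive greedy baseline can be illustrated with a short Python sketch. It is not the paper's algorithm. It assumes the usual definition of navigability, which this page does not spell out: for every ordered pair of distinct points s and t, some out-neighbor u of s satisfies d(u, t) < d(s, t). Under that assumption, each point s gives one set cover instance in which candidate neighbor u "covers" target t when d(u, t) < d(s, t). The function names `greedy_navigable_graph` and `is_navigable` are hypothetical, and the sketch assumes distinct points have strictly positive distance. It favors clarity over speed and recomputes coverage in every round, so it does not attain even the $O(n^3)$ bound of a carefully implemented greedy.

```python
from typing import Callable, Dict, Sequence, Set


def greedy_navigable_graph(points: Sequence, dist: Callable) -> Dict[int, Set[int]]:
    """Greedy set cover per source point (illustrative, unoptimized).

    Assumed definition: edge s -> u "covers" target t if
    dist(u, t) < dist(s, t). Requires dist(a, b) > 0 for distinct points.
    """
    n = len(points)
    # Precompute all pairwise distances once.
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    graph: Dict[int, Set[int]] = {}
    for s in range(n):
        uncovered = {t for t in range(n) if t != s}
        neighbors: Set[int] = set()
        while uncovered:
            best_u, best_cov = None, set()
            for u in range(n):
                if u == s or u in neighbors:
                    continue
                # Targets that u brings strictly closer than s does.
                cov = {t for t in uncovered if d[u][t] < d[s][t]}
                if len(cov) > len(best_cov):
                    best_u, best_cov = u, cov
            if best_u is None:
                # Happens only if two distinct indices have distance 0.
                raise ValueError("duplicate points: no candidate makes progress")
            neighbors.add(best_u)
            uncovered -= best_cov
        graph[s] = neighbors
    return graph


def is_navigable(graph: Dict[int, Set[int]], points: Sequence, dist: Callable) -> bool:
    """Check the assumed navigability condition for every ordered pair (s, t)."""
    n = len(points)
    for s in range(n):
        for t in range(n):
            if s == t:
                continue
            d_st = dist(points[s], points[t])
            if not any(dist(points[u], points[t]) < d_st for u in graph[s]):
                return False
    return True
```

On the four collinear points 0, 1, 2, 3 with absolute difference as the distance, this sketch returns the path graph 0 - 1 - 2 - 3 (each point linked to its immediate neighbors), which `is_navigable` accepts. The n instances share the same distance matrix, which is the correlation the abstract says the faster algorithm exploits.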
Problem

Research questions and friction points this paper is trying to address.

Efficiently constructing sparse navigable graphs for nearest neighbor search
Reducing the running time of graph construction from the naive O(n^3) to Õ(n^2)
Achieving O(log n)-approximate sparsity with provable guarantees
Innovation

Methods, ideas, or system contributions that make the work stand out.

O(log n)-approximate sparsest navigable graph construction in Õ(n^2) time
Streaming and sublinear-time set cover techniques
Problem-specific pre-processing for efficiency