🤖 AI Summary
This paper studies the construction of the sparsest α-navigable graphs: given an n-point metric space (P, d) and α ≥ 1, construct a directed graph G = (P, E) such that for every pair of distinct points s, t ∈ P, there exists an edge (s, u) ∈ E satisfying d(u, t) < d(s, t)/α; the objective is to minimize either the maximum out-degree or the total number of edges. The authors establish an approximation-preserving equivalence between this problem and Set Cover, which yields an O(n³)-time (ln n + 1)-approximation algorithm and shows that achieving an o(ln n)-approximation is NP-hard. Building on this equivalence, they give two faster O(ln n)-approximation algorithms: (1) one that runs in Õ(n · OPT) time, exploiting the sparsity of the optimal solution; and (2) a bicriteria algorithm that runs in Õ(n^ω) time using fast matrix multiplication (where ω < 2.373 is the matrix multiplication exponent) and computes an O(ln n)-approximation to the sparsest 2α-navigable graph. A query complexity lower bound shows that any o(n)-approximation must examine Ω(n²) distances, so the Õ(n · OPT)-time algorithm is essentially best possible when OPT = Õ(n).
📝 Abstract
We initiate the study of approximation algorithms and computational barriers for constructing sparse $\alpha$-navigable graphs [IX23, DGM+24], a core primitive underlying recent advances in graph-based nearest neighbor search. Given an $n$-point dataset $P$ with an associated metric $\mathsf{d}$ and a parameter $\alpha \geq 1$, the goal is to efficiently build the sparsest graph $G=(P, E)$ that is $\alpha$-navigable: for every distinct $s, t \in P$, there exists an edge $(s, u) \in E$ with $\mathsf{d}(u, t) < \mathsf{d}(s, t)/\alpha$. We consider two natural sparsity objectives: minimizing the maximum out-degree and minimizing the total size.
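To make the navigability condition concrete, here is a minimal brute-force checker. It is only a sketch: the function name `is_alpha_navigable`, the adjacency-dict representation of the graph, and the use of Euclidean distance as the default metric are illustrative assumptions, not anything specified in the abstract.

```python
import math
from itertools import permutations


def is_alpha_navigable(points, edges, alpha, dist=math.dist):
    """Check the alpha-navigability condition by brute force.

    points: list of coordinate tuples (assumed pairwise distinct).
    edges:  dict mapping a source index s to an iterable of out-neighbor indices.
    dist:   metric on points; Euclidean distance is an assumption made here.
    """
    n = len(points)
    # Every ordered pair (s, t) with s != t needs an out-neighbor u of s
    # that is strictly closer to t than d(s, t) / alpha.
    for s, t in permutations(range(n), 2):
        threshold = dist(points[s], points[t]) / alpha
        if not any(dist(points[u], points[t]) < threshold
                   for u in edges.get(s, ())):
            return False
    return True


# Four evenly spaced points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
complete = {s: [u for u in range(4) if u != s] for s in range(4)}

path_ok_alpha_1 = is_alpha_navigable(points, path, alpha=1.0)
path_ok_alpha_2 = is_alpha_navigable(points, path, alpha=2.0)
complete_ok_alpha_2 = is_alpha_navigable(points, complete, alpha=2.0)
```

On this toy instance the path graph is navigable for $\alpha = 1$ but not for $\alpha = 2$, while the complete graph is navigable for any $\alpha \geq 1$ because the edge $(s, t)$ itself reaches distance zero from $t$.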
We first show a strong negative result: the slow-preprocessing version of DiskANN (analyzed in [IX23] for low-doubling metrics) can yield solutions whose sparsity is $\widetilde{\Omega}(n)$ times larger than optimal, even on Euclidean instances. We then show a tight approximation-preserving equivalence between the Sparsest Navigable Graph problem and the classic Set Cover problem, obtaining an $O(n^3)$-time $(\ln n + 1)$-approximation algorithm, as well as establishing NP-hardness of achieving an $o(\ln n)$-approximation. Building on this equivalence, we develop faster $O(\ln n)$-approximation algorithms. The first runs in $\widetilde{O}(n \cdot \mathrm{OPT})$ time and is thus much faster when the optimal solution is sparse. The second, based on fast matrix multiplication, is a bicriteria algorithm that computes an $O(\ln n)$-approximation to the sparsest $2\alpha$-navigable graph, running in $\widetilde{O}(n^{\omega})$ time.
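One way to read the Set Cover connection is per source: for a fixed $s$, the elements to cover are the targets $t \neq s$, and a candidate out-neighbor $u$ covers every $t$ with $\mathsf{d}(u, t) < \mathsf{d}(s, t)/\alpha$. The sketch below runs the standard greedy Set Cover heuristic independently for each source. It is an illustration under stated assumptions, not the authors' implementation: the name `greedy_navigable_graph` is hypothetical, Euclidean distance is assumed, points are assumed pairwise distinct, and this naive version makes no attempt to meet the $O(n^3)$ running time quoted above.

```python
import math


def greedy_navigable_graph(points, alpha, dist=math.dist):
    """Build an alpha-navigable graph with per-source greedy Set Cover.

    Returns a dict mapping each source index to its list of out-neighbors.
    Assumes pairwise distinct points, so u = t always covers target t.
    """
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]
    edges = {}
    for s in range(n):
        uncovered = {t for t in range(n) if t != s}
        candidates = [u for u in range(n) if u != s]
        chosen = []
        while uncovered:
            # Greedy step: pick the candidate covering the most uncovered targets.
            best = max(
                candidates,
                key=lambda u: sum(1 for t in uncovered
                                  if d[u][t] < d[s][t] / alpha),
            )
            covered = {t for t in uncovered if d[best][t] < d[s][t] / alpha}
            if not covered:
                # Only possible if two points coincide (zero distance).
                raise ValueError("points must be pairwise distinct")
            chosen.append(best)
            uncovered -= covered
        edges[s] = chosen
    return edges


# Four evenly spaced points on a line, alpha = 1.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
graph = greedy_navigable_graph(points, alpha=1.0)
total_edges = sum(len(nbrs) for nbrs in graph.values())
```

On this toy instance the greedy procedure recovers the bidirectional path graph with six directed edges. Because each source is handled independently, the per-source greedy guarantee applies to each out-degree, and therefore to both the maximum out-degree and the total size.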
Finally, we complement our upper bounds with a query complexity lower bound, showing that any $o(n)$-approximation requires examining $\Omega(n^2)$ distances. This result shows that in the regime where $\mathrm{OPT} = \widetilde{O}(n)$, our $\widetilde{O}(n \cdot \mathrm{OPT})$-time algorithm is essentially best possible.