🤖 AI Summary
This work addresses the high cost of tuning the reachability parameter $\alpha$ in large-scale graph-based vector retrieval, where adjusting $\alpha$ conventionally requires rebuilding the entire index. The authors propose RP-Tuning, the first method enabling dynamic adjustment of α-reachable graphs without reconstruction. By repurposing DiskANN’s pruning mechanism as a post-processing step, RP-Tuning efficiently modifies graph connectivity while preserving theoretical reachability guarantees. The approach applies to both general metric spaces and Euclidean space, and delivers significant practical benefits: across four public datasets, it accelerates DiskANN’s tuning process by up to 43× with negligible additional overhead.
📝 Abstract
Vector similarity search is an essential primitive in modern AI and ML applications. Most vector databases adopt graph-based approximate nearest neighbor (ANN) search algorithms, such as DiskANN (Subramanya et al., 2019), which have demonstrated state-of-the-art empirical performance. DiskANN's graph construction is governed by a reachability parameter $\alpha$, which controls a trade-off among construction time, query time, and accuracy. However, adaptively tuning this trade-off typically requires rebuilding the index for each candidate $\alpha$ value, which is prohibitive at scale. In this work, we propose RP-Tuning, an efficient post-hoc routine based on DiskANN's pruning step that adjusts the parameter $\alpha$ without reconstructing the full index. Within the $\alpha$-reachability framework of prior theoretical work (Indyk and Xu, 2023; Gollapudi et al., 2025), we prove that pruning an initially $\alpha$-reachable graph with RP-Tuning preserves worst-case reachability guarantees in general metrics and yields improved guarantees in Euclidean metrics. Empirically, we show that RP-Tuning accelerates DiskANN tuning on four public datasets by up to $43\times$ with negligible overhead.