🤖 AI Summary
Graph indexes for high-dimensional approximate nearest neighbor (ANN) search are static, so the search and construction logs they generate over their lifespan go largely unused. To address this, the paper proposes EnhanceGraph, a continuously self-enhancing graph indexing framework. Its core innovation is a novel structure called a conjugate graph, which integrates both kinds of logs: from search logs it stores edges from local optima to global optima to improve routing to the nearest neighbor, and from construction logs it stores edges pruned from the proximity graph to improve retrieval of the k nearest neighbors. The methodology encompasses log-driven reinforcement learning, local-global collaborative routing, proximity-graph pruning modeling, and theoretical convergence analysis. Evaluated on several public and industrial datasets, EnhanceGraph substantially improves recall (in the best case from 41.74% to 93.42%) without sacrificing search efficiency. It has been integrated into VSAG, Ant Group's open-source vector library.
📝 Abstract
Approximate Nearest Neighbor Search in high-dimensional vector spaces has recently garnered considerable attention due to the rapid advancement of deep learning techniques. We observe that a substantial number of search and construction logs are generated throughout the lifespan of a graph-based index. However, these two types of valuable logs are not fully exploited because existing indexes are static. We present the EnhanceGraph framework, which integrates both types of logs into a novel structure called a conjugate graph. The conjugate graph is then used to improve search quality. Based on theoretical analyses and observations of the limitations of graph-based indexes, we propose several optimization methods. For the search logs, the conjugate graph stores edges from local optima to global optima to improve routing to the nearest neighbor. For the construction logs, the conjugate graph stores the edges pruned from the proximity graph to improve retrieval of the k nearest neighbors. Our experimental results on several public and real-world industrial datasets show that EnhanceGraph significantly improves search accuracy, with the largest recall improvement being from 41.74% to 93.42%, without sacrificing search efficiency. In addition, our EnhanceGraph algorithm has been integrated into Ant Group's open-source vector library, VSAG.
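
The two roles of the conjugate graph described above can be illustrated with a toy sketch. It is not the paper's algorithm or the VSAG API: the function names (`greedy_search`, `learn_from_search`, `prune_and_log`), the dictionary-based graph layout, the brute-force ground truth, and the degree-cap pruning rule are all assumptions made for illustration only.

```python
import numpy as np

def greedy_search(vectors, graph, query, entry, conjugate=None):
    """Greedy routing: hop to the closest neighbor until none is closer."""
    current = entry
    current_dist = float(np.linalg.norm(vectors[current] - query))
    while True:
        neighbors = list(graph.get(current, ()))
        if conjugate is not None:
            # Edges learned from logs are followed like ordinary edges.
            neighbors += list(conjugate.get(current, ()))
        best, best_dist = current, current_dist
        for n in neighbors:
            d = float(np.linalg.norm(vectors[n] - query))
            if d < best_dist:
                best, best_dist = n, d
        if best == current:
            return current  # no closer neighbor: a (possibly local) optimum
        current, current_dist = best, best_dist

def learn_from_search(vectors, graph, conjugate, query, entry):
    """Search-log side: if routing stalls, record local optimum -> true NN."""
    local_opt = greedy_search(vectors, graph, query, entry, conjugate)
    # Assumption: brute force stands in for whatever ground truth the logs give.
    true_nn = int(np.argmin(np.linalg.norm(vectors - query, axis=1)))
    if local_opt != true_nn:
        conjugate.setdefault(local_opt, set()).add(true_nn)
    return local_opt, true_nn

def prune_and_log(vectors, base, candidates, max_degree, pruned_log):
    """Construction-log side: keep the closest candidates, log the rest.

    Assumption: a plain degree cap replaces the index's real pruning rule.
    """
    ranked = sorted(
        candidates,
        key=lambda c: float(np.linalg.norm(vectors[c] - vectors[base])),
    )
    kept, dropped = ranked[:max_degree], ranked[max_degree:]
    if dropped:
        pruned_log.setdefault(base, set()).update(dropped)
    return kept

# Toy data: four points on a line; node 2 has no edge to node 3,
# so greedy routing toward a query near node 3 stalls at node 2.
vectors = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
graph = {0: [1], 1: [0, 2], 2: [1], 3: [2]}
query = np.array([3.1, 0.0])

conjugate = {}
before = greedy_search(vectors, graph, query, entry=0)
learn_from_search(vectors, graph, conjugate, query, entry=0)
after = greedy_search(vectors, graph, query, entry=0, conjugate=conjugate)

pruned = {}
kept = prune_and_log(vectors, 0, [3, 1, 2], max_degree=1, pruned_log=pruned)
```

In this toy setup the first search stops at node 2, the logged edge from node 2 to node 3 lets the repeated search reach the true nearest neighbor, and the edges dropped during pruning remain available in `pruned` for later use. The base graph is never modified; all learned edges live in separate side structures, which is the sense in which the sketch mirrors a conjugate graph.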