FGIM: a Fast Graph-based Indexes Merging Framework for Approximate Nearest Neighbor Search

📅 2026-03-23
🤖 AI Summary
This work addresses the challenge of efficiently merging graph indexes in distributed systems and real-time vector databases, a problem that has received little systematic investigation. To this end, the authors propose FGIM, a general and efficient three-stage framework for graph index merging. FGIM first converts the input navigable graphs (e.g., HNSW) into a k-nearest neighbor graph (k-NNG), extracting candidate neighbors from the input indexes through cross-querying. It then refines the k-NNG to recover overlooked high-quality neighbors and maintain graph connectivity, and finally transforms the refined k-NNG back into a navigable graph. Extensive experiments on six real-world datasets show that FGIM achieves up to a 3.5× speedup over HNSW's incremental construction and an average 7.9× speedup for methods without incremental support, while maintaining comparable or superior search performance.

📝 Abstract
As state-of-the-art methods for high-dimensional data retrieval, graph-based Approximate Nearest Neighbor Search (ANNS) approaches have attracted increasing attention and play a crucial role in many real-world applications, e.g., retrieval-augmented generation (RAG) and recommendation systems. Unlike the extensive body of work on designing efficient graph-based ANNS methods, this paper studies how to merge multiple existing graph-based indexes into a single one, a task that is also crucial in many real-world scenarios (e.g., cluster consolidation in distributed systems and read-write contention in real-time vector databases). We propose a Fast Graph-based Indexes Merging (FGIM) framework with three core techniques: (1) a Proximity Graph (PG) to $k$-Nearest Neighbor Graph ($k$-NNG) transformation, which extracts potential candidate neighbors from the input graph-based indexes through cross-querying; (2) $k$-NNG refinement, which identifies overlooked high-quality neighbors and maintains graph connectivity; and (3) a $k$-NNG to PG transformation, which improves graph navigability and search performance. We then integrate FGIM with the state-of-the-art ANNS method HNSW and with other mainstream graph-based methods to demonstrate its generality and merging efficiency. Extensive experiments on six real-world datasets show that FGIM is applicable to various mainstream graph-based ANNS methods and achieves up to a 3.5$\times$ speedup over HNSW's incremental construction and an average 7.9$\times$ speedup for methods without incremental support, while maintaining comparable or superior search performance.
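The three stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the algorithms, so the refinement step here is assumed to be an NN-Descent-style neighbor-of-neighbor pass, the final step is assumed to use HNSW-style heuristic pruning, and connectivity repair is left out. All function names (`cross_query_knng`, `refine_knng`, `knng_to_pg`) and parameters (`k`, `ef`, `rounds`, `max_degree`) are hypothetical, and exact k-NN graphs stand in for the two input indexes.

```python
import heapq
import math
import random

def knn_of(u, cand, points, k):
    """Return the k candidate ids closest to point u (excluding u itself), nearest first."""
    uniq = {v for v in cand if v != u}
    return sorted(uniq, key=lambda v: math.dist(points[u], points[v]))[:k]

def greedy_search(graph, points, query, entry, k, ef=16):
    """Best-first search over one input graph, keeping the ef closest nodes seen."""
    d0 = math.dist(points[entry], query)
    visited, frontier, best = {entry}, [(d0, entry)], [(-d0, entry)]
    while frontier:
        d, u = heapq.heappop(frontier)
        if len(best) >= ef and d > -best[0][0]:
            break  # closest unexpanded node is worse than the ef-th best result
        for v in graph[u]:
            if v in visited:
                continue
            visited.add(v)
            dv = math.dist(points[v], query)
            if len(best) < ef or dv < -best[0][0]:
                heapq.heappush(frontier, (dv, v))
                heapq.heappush(best, (-dv, v))
                if len(best) > ef:
                    heapq.heappop(best)
    return [v for _, v in sorted((-nd, v) for nd, v in best)][:k]

def cross_query_knng(g_a, g_b, points, k):
    """Stage 1: each node keeps its own neighbors and queries the other graph."""
    knng = {}
    for own, other in ((g_a, g_b), (g_b, g_a)):
        entry = next(iter(other))  # assumption: first node id serves as entry point
        for u, nbrs in own.items():
            found = greedy_search(other, points, points[u], entry, k)
            knng[u] = knn_of(u, list(nbrs) + found, points, k)
    return knng

def refine_knng(knng, points, k, rounds=2):
    """Stage 2 (assumed NN-Descent-style): test neighbors of neighbors as candidates."""
    for _ in range(rounds):
        knng = {u: knn_of(u, nbrs + [w for v in nbrs for w in knng[v]], points, k)
                for u, nbrs in knng.items()}
    return knng

def knng_to_pg(knng, points, max_degree):
    """Stage 3 (assumed HNSW-style pruning): keep v only if u is closer to v than every kept neighbor is."""
    pg = {}
    for u, nbrs in knng.items():
        kept = []
        for v in nbrs:  # nbrs are already sorted by distance to u
            d_uv = math.dist(points[u], points[v])
            if all(math.dist(points[s], points[v]) > d_uv for s in kept):
                kept.append(v)
            if len(kept) == max_degree:
                break
        pg[u] = kept
    return pg

random.seed(0)
points = [(random.random(), random.random()) for _ in range(60)]
ids_a, ids_b = range(0, 30), range(30, 60)
k = 5
# Toy inputs: exact k-NN graphs stand in for two separately built indexes.
g_a = {u: knn_of(u, ids_a, points, k) for u in ids_a}
g_b = {u: knn_of(u, ids_b, points, k) for u in ids_b}
knng = refine_knng(cross_query_knng(g_a, g_b, points, k), points, k)
merged = knng_to_pg(knng, points, max_degree=k)
# Count edges that connect a node from index A to a node from index B.
cross_edges = sum((u < 30) != (v < 30) for u, nbrs in merged.items() for v in nbrs)
print(len(merged), cross_edges)
```

The point of the sketch is the data flow: neither input graph is rebuilt from scratch, and the only distance computations come from searching the other index and from comparing a bounded candidate set per node. The input graphs here are exact k-NN graphs only to keep the example short; a real deployment would pass in the adjacency lists of the existing indexes.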
Problem

Research questions and friction points this paper is trying to address.

Approximate Nearest Neighbor Search
Graph-based Indexes
Index Merging
Vector Databases
Distributed Systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based Index Merging
Approximate Nearest Neighbor Search
k-Nearest Neighbor Graph
Index Transformation
HNSW
Zekai Wu
East China Normal University, China
Jiabao Jin
Ant Group
Vector DataBase
Peng Cheng
Tongji University, China
Xiaoyao Zhong
Ant Group, China
Lei Chen
Hong Kong University of Science and Technology
Human Powered Machine Learning, Databases, Data Mining
Yongxin Tong
Beihang University, China
Zhitao Shen
Ant Group
database, data storage
Jingkuan Song
Tongji University, China
Heng Tao Shen
Tongji University, China
Xuemin Lin
Shanghai Jiao Tong University, China