How Should We Evaluate Data Deletion in Graph-Based ANN Indexes?

📅 2025-12-05
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Problem: Existing graph-based approximate nearest neighbor search (ANNS) indexes lack a systematic evaluation of data deletion, particularly under realistic dynamic workloads. Method: This paper establishes an evaluation framework tailored to practical dynamic scenarios and categorizes graph-based ANNS deletion methods into three classes, each formalized mathematically. It further proposes Deletion Control, a mechanism that dynamically selects the deletion method needed to meet a required search accuracy. Contribution/Results: The framework is applied to HNSW to analyze the effects of data deletion. Deletion Control is reported to improve deletion throughput and query stability while preserving retrieval accuracy, which makes the index easier to maintain under continuous insertions and deletions.

πŸ“ Abstract
Approximate Nearest Neighbor Search (ANNS) has recently gained significant attention due to its many applications, such as Retrieval-Augmented Generation. Such applications require ANNS algorithms that support dynamic data, so the ANNS problem on dynamic data has attracted considerable interest. However, a comprehensive evaluation methodology for data deletion in ANNS has yet to be established. This study proposes an experimental framework and comprehensive evaluation metrics to assess the efficiency of data deletion for ANNS indexes under practical use cases. Specifically, we categorize data deletion methods in graph-based ANNS into three approaches and formalize them mathematically. The performance is assessed in terms of accuracy, query speed, and other relevant metrics. Finally, we apply the proposed evaluation framework to Hierarchical Navigable Small World, one of the state-of-the-art ANNS methods, to analyze the effects of data deletion, and propose Deletion Control, a method which dynamically selects the appropriate deletion method under a required search accuracy.
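To make the accuracy metric concrete, here is a minimal sketch of how recall might be measured after deletions. It is not the paper's framework: the function names (`brute_force_knn`, `recall_at_k`, `evaluate_after_deletion`), the toy data, and the extra "leaked" counter for deleted ids that still appear in results are all assumptions made for illustration. The one point it is meant to show is that ground truth should be recomputed over the points that remain after deletion.

```python
import math

def brute_force_knn(query, points, live_ids, k):
    """Exact k nearest neighbors among the given ids (used as ground truth)."""
    ranked = sorted(live_ids, key=lambda i: math.dist(query, points[i]))
    return ranked[:k]

def recall_at_k(retrieved, truth):
    """Fraction of the true neighbors that appear in the retrieved list."""
    if not truth:
        return 1.0
    return len(set(retrieved) & set(truth)) / len(truth)

def evaluate_after_deletion(search_fn, points, deleted, queries, k):
    """Return (mean recall@k, number of deleted ids returned by search_fn).

    Ground truth is computed only over points that have not been deleted.
    """
    live_ids = [i for i in range(len(points)) if i not in deleted]
    recalls, leaked = [], 0
    for q in queries:
        truth = brute_force_knn(q, points, live_ids, k)
        result = search_fn(q, k)
        leaked += sum(1 for i in result if i in deleted)
        recalls.append(recall_at_k(result, truth))
    return sum(recalls) / len(recalls), leaked

# Toy data (hypothetical): ten points on a line, the two nearest to the query deleted.
points = [(float(i), 0.0) for i in range(10)]
deleted = {0, 1}
queries = [(0.0, 0.0)]

def stale_search(query, k):
    # Stand-in for an index that ignores deletions entirely.
    return brute_force_knn(query, points, range(len(points)), k)

def exact_search(query, k):
    # Stand-in for an index that handles deletions perfectly.
    live = [i for i in range(len(points)) if i not in deleted]
    return brute_force_knn(query, points, live, k)

print(evaluate_after_deletion(stale_search, points, deleted, queries, k=2))  # (0.0, 2)
print(evaluate_after_deletion(exact_search, points, deleted, queries, k=2))  # (1.0, 0)
```

In a real experiment `search_fn` would wrap the index under test, and query latency would be timed alongside recall.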
Problem

Research questions and friction points this paper is trying to address.

Evaluates data deletion efficiency in graph-based ANN indexes
Proposes framework to assess deletion methods' accuracy and speed
Analyzes deletion effects on Hierarchical Navigable Small World indexes
Innovation

Methods, ideas, or system contributions that make the work stand out.

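Proposes an experimental framework and evaluation metrics for data deletion under practical use cases
Categorizes deletion methods in graph-based ANNS into three approaches and formalizes them mathematically
Introduces Deletion Control, which dynamically selects a deletion method under a required search accuracy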
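The idea of selecting a deletion method under a required search accuracy can be sketched as follows. This is an illustration only, not the paper's algorithm: the `DeletionProfile` type, the method labels, the numbers, and the selection rule (cheapest method whose measured recall meets the target, otherwise the most accurate one) are all assumptions, since the summary does not name the three approaches or specify the rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeletionProfile:
    name: str               # hypothetical label for a deletion method
    cost_per_delete: float  # measured cost per deletion (e.g. milliseconds)
    recall: float           # recall measured after a deletion workload

def select_deletion_method(profiles, required_recall):
    """Pick the cheapest profiled method that meets the recall target.

    Assumed fallback: if no method meets the target, return the one
    with the highest measured recall.
    """
    if not profiles:
        raise ValueError("at least one profile is required")
    feasible = [p for p in profiles if p.recall >= required_recall]
    if feasible:
        return min(feasible, key=lambda p: p.cost_per_delete)
    return max(profiles, key=lambda p: p.recall)

# Placeholder measurements, not results from the paper.
profiles = [
    DeletionProfile("method_a", cost_per_delete=0.01, recall=0.90),
    DeletionProfile("method_b", cost_per_delete=0.50, recall=0.95),
    DeletionProfile("method_c", cost_per_delete=5.00, recall=0.99),
]

print(select_deletion_method(profiles, required_recall=0.93).name)   # method_b
print(select_deletion_method(profiles, required_recall=0.999).name)  # method_c
```

Re-running the selection as fresh measurements arrive is one way the choice could adapt over time as the index accumulates deletions.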