Accelerating High-Dimensional Nearest Neighbor Search with Dynamic Query Preference

📅 2025-08-10
🤖 AI Summary
Existing graph-based indexes for high-dimensional approximate nearest neighbor search (ANNS), such as HNSW and NSG, assume a uniform query distribution and therefore serve frequently issued queries no faster than rare ones. To address this, the paper proposes the Dual-Index Query Framework (DQF). DQF introduces two key innovations: (1) a dual-layer index structure comprising a “hot” index for high-frequency nodes and a full index covering the entire dataset, so that hot queries can be answered from the smaller index; and (2) a decision-tree-based dynamic search strategy that classifies each query at search time and triggers early termination when the full index does not need to be searched. When the query distribution changes, DQF adapts without full index reconstruction. Experiments on four real-world datasets show that DQF achieves a 2.0–5.7× speedup over state-of-the-art baselines while maintaining 95% recall.

📝 Abstract
Approximate Nearest Neighbor Search (ANNS) is a crucial operation in databases and artificial intelligence. Current graph-based ANNS methods, such as HNSW and NSG, have shown remarkable performance but are designed under the assumption of a uniform query distribution. However, in practical scenarios, user preferences and temporal query dynamics lead to some queries being issued more frequently than others. To fully utilize these characteristics, we propose DQF, a novel Dual-Index Query Framework. This framework comprises a dual-layer index structure and a dynamic search strategy based on a decision tree. The dual-layer index structure consists of a hot index for high-frequency nodes and a full index for the entire dataset, allowing hot and cold queries to be managed separately. Furthermore, we propose a dynamic search strategy that employs a decision tree to adapt to the specific characteristics of each query. The decision tree evaluates whether a query is of the high-frequency type in order to detect opportunities for early termination on the dual-layer index, avoiding unnecessary searches in the full index. Experimental results on four real-world datasets demonstrate that the Dual-Index Query Framework achieves a significant speedup of 2.0–5.7× over state-of-the-art algorithms while maintaining a 95% recall rate. Importantly, it does not require full index reconstruction when query distributions change, underscoring its efficiency and practicality in dynamic query distribution scenarios.
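The query flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: both indexes are replaced by brute-force scans instead of graph indexes, and the decision tree is replaced by a hypothetical one-rule predicate (`kth_distance_rule`). The class name `DualIndexSketch`, the threshold value, and all helper names are assumptions introduced only for illustration.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[float, int]  # (distance, point id)


def l2(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_knn(query: Vector, data: List[Vector],
                    ids: Sequence[int], k: int) -> List[Hit]:
    """Exact k-NN over the given ids, nearest first.

    Stand-in for a graph index search (assumption: the real system
    would use an HNSW- or NSG-style index here).
    """
    return heapq.nsmallest(k, ((l2(query, data[i]), i) for i in ids))


def kth_distance_rule(threshold: float) -> Callable[[List[Hit], int], bool]:
    """Hypothetical stand-in for the decision tree: accept the hot result
    when it holds k hits and the k-th hit is within `threshold`."""
    def rule(hot_hits: List[Hit], k: int) -> bool:
        return len(hot_hits) >= k and hot_hits[-1][0] <= threshold
    return rule


class DualIndexSketch:
    """Search a small hot subset first; fall back to the full set."""

    def __init__(self, data: List[Vector], hot_ids: Sequence[int],
                 can_stop_early: Callable[[List[Hit], int], bool]) -> None:
        self.data = data
        self.hot_ids = list(hot_ids)
        self.all_ids = list(range(len(data)))
        self.can_stop_early = can_stop_early

    def search(self, query: Vector, k: int) -> Tuple[List[Hit], str]:
        hot_hits = brute_force_knn(query, self.data, self.hot_ids, k)
        if self.can_stop_early(hot_hits, k):
            return hot_hits, "hot"   # early termination, full index skipped
        full_hits = brute_force_knn(query, self.data, self.all_ids, k)
        return full_hits, "full"


# Usage: points 0 and 1 form the hot set; other points are cold.
points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [9.0, 9.0]]
index = DualIndexSketch(points, hot_ids=[0, 1],
                        can_stop_early=kth_distance_rule(0.5))
hot_hits, hot_route = index.search([0.05, 0.0], k=2)
cold_hits, cold_route = index.search([5.0, 5.0], k=2)
```

In this toy setup a query near the hot points is answered from the hot subset alone, while a query near the cold points fails the rule and falls through to the full scan. A learned decision tree would replace the fixed threshold with splits over query features; which features DQF uses is not stated in this summary.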
Problem

Research questions and friction points this paper is trying to address.

Optimizes high-dimensional nearest neighbor search for dynamic query distributions
Manages high-frequency and low-frequency queries with dual-layer index structure
Adapts search strategy using decision trees for early termination opportunities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-layer index for hot and cold queries
Decision tree for dynamic search strategy
No full index rebuild for query changes
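The summary does not say how the hot index is kept current, only that the full index need not be rebuilt when the query distribution shifts. One plausible way to picture this, offered purely as an assumption, is to track how often each node appears in query results and periodically reselect the hot set from those counts, leaving the full index untouched. The class `HotSetTracker`, its decay factor, and its epoch scheme are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Iterable, List


class HotSetTracker:
    """Tracks node popularity with exponential decay (illustrative only)."""

    def __init__(self, capacity: int, decay: float = 0.5) -> None:
        self.capacity = capacity          # max number of hot nodes
        self.decay = decay                # weight kept from older epochs
        self.scores: Counter = Counter()  # node id -> decayed hit count

    def record(self, result_ids: Iterable[int]) -> None:
        """Credit every node returned for one query."""
        for node_id in result_ids:
            self.scores[node_id] += 1.0

    def end_epoch(self) -> List[int]:
        """Decay all scores, drop negligible ones, return the new hot set.

        Only the hot subset changes; the full index is never rebuilt.
        """
        for node_id in list(self.scores):
            self.scores[node_id] *= self.decay
            if self.scores[node_id] < 1e-3:
                del self.scores[node_id]
        return [node_id for node_id, _ in
                self.scores.most_common(self.capacity)]


# Usage: popularity shifts from nodes {1, 2} to nodes {3, 4}.
tracker = HotSetTracker(capacity=2)
for _ in range(2):
    tracker.record([1, 2])
tracker.record([3])
first_hot = tracker.end_epoch()

for _ in range(3):
    tracker.record([3, 4])
second_hot = tracker.end_epoch()
```

After the first epoch the hot set holds nodes 1 and 2; after the second, the decayed counts for those nodes fall below the fresh counts for nodes 3 and 4, so the hot set follows the shift while the cost stays proportional to the hot set size.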