Filtered Approximate Nearest Neighbor Search Cost Estimation

📅 2026-02-06
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
This work addresses the highly variable search cost in hybrid queries arising from the coupling of vector similarity search and structured attribute filtering, which poses significant challenges for effective optimization. The paper proposes E2E, an end-to-end cost estimation framework that, for the first time, explicitly models the correlation between query vector distributions and attribute selectivity. This approach substantially improves the accuracy of cost prediction for filtered approximate nearest neighbor search. Leveraging this estimator, E2E enables an efficient early-termination strategy that achieves 2–3× retrieval speedup on real-world datasets while maintaining high recall. By integrating a deep learning–based cost model with vector indexing and structured filtering mechanisms, E2E establishes a novel paradigm for hybrid query optimization.

📝 Abstract
Hybrid queries combining high-dimensional vector similarity with structured attribute filtering have garnered significant attention across both academia and industry. A critical instance of this paradigm is filtered Approximate k Nearest Neighbor (AKNN) search, where embeddings (e.g., of images or text) are queried alongside constraints such as labels or numerical ranges. While essential for rich retrieval, optimizing these queries remains challenging due to the highly variable search cost induced by combined filters. In this paper, we propose a novel cost estimation framework, E2E, for filtered AKNN search and demonstrate its utility in downstream optimization tasks, specifically early termination. Unlike existing approaches, our model explicitly captures the correlation between the query vector distribution and attribute-value selectivity, yielding significantly higher estimation accuracy. By leveraging these estimates to refine search termination conditions, we achieve substantial performance gains. Experimental results on real-world datasets demonstrate that our approach improves retrieval efficiency by 2x-3x over state-of-the-art baselines while maintaining high search accuracy.
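
To make the idea of estimate-driven early termination concrete, here is a minimal Python sketch. It is not the E2E model: the learned estimator is replaced by a toy heuristic that measures filter selectivity among sampled points near the query, and the index traversal is replaced by a candidate order supplied by the caller. The names `estimate_budget`, `filtered_search`, `slack`, and `candidate_order` are hypothetical and chosen only for illustration.

```python
import heapq
import math
import random


def estimate_budget(query, vectors, labels, target_label, k,
                    sample_size=200, near=50, slack=2.0, seed=0):
    """Toy stand-in for a learned cost model (an assumption, not E2E).

    Estimates how many candidates must be visited to collect k matches,
    using the filter's selectivity among sampled points closest to the
    query rather than the global selectivity.
    """
    rng = random.Random(seed)
    ids = rng.sample(range(len(vectors)), min(sample_size, len(vectors)))
    ids.sort(key=lambda i: math.dist(query, vectors[i]))
    nearest = ids[:near]
    matches = sum(1 for i in nearest if labels[i] == target_label)
    local_selectivity = max(matches, 1) / len(nearest)
    return math.ceil(slack * k / local_selectivity)


def filtered_search(query, vectors, labels, target_label, k,
                    candidate_order, budget):
    """Visit candidates in the given order and stop after `budget` visits.

    `candidate_order` stands in for the order in which a vector index
    would surface candidates (roughly nearest first).
    """
    heap = []  # max-heap via negated distance: best k matches so far
    visited = 0
    for i in candidate_order:
        if visited >= budget:
            break  # early termination driven by the cost estimate
        visited += 1
        if labels[i] != target_label:
            continue  # candidate fails the attribute filter
        d = math.dist(query, vectors[i])
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    results = sorted((-neg_d, i) for neg_d, i in heap)
    return results, visited


# Usage: points on a line, every fourth point carries label "a".
vectors = [(float(i), 0.0) for i in range(1000)]
labels = ["a" if i % 4 == 0 else "b" for i in range(1000)]
query = (0.0, 0.0)
budget = estimate_budget(query, vectors, labels, "a", k=5)
results, visited = filtered_search(query, vectors, labels, "a", 5,
                                   range(1000), budget)
```

The budget grows when matching points are rare near the query and shrinks when they are common, which is the query-dependent behavior a cost estimator of this kind is meant to capture. A real system would take the candidate order from its vector index and the budget from a trained model.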
Problem

Research questions and friction points this paper is trying to address.

filtered approximate nearest neighbor search
cost estimation
hybrid queries
search optimization
attribute filtering
Innovation

Methods, ideas, or system contributions that make the work stand out.

filtered approximate nearest neighbor
cost estimation
vector-attribute correlation
early termination
hybrid query optimization