Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors

📅 2025-07-29
🤖 AI Summary
Problem: Existing research on Filtered Approximate Nearest Neighbor Search (FANNS) lacks realistic, diverse benchmark datasets built from modern transformer-based text embeddings. Method: This paper introduces a large-scale FANNS benchmark dataset of embedding vectors for over 2.7 million arXiv abstracts with 11 real-world filtering attributes, and evaluates state-of-the-art algorithms, including ACORN, SeRF, Filtered-DiskANN, and UNG, on it under a unified framework. Contribution/Results: No single FANNS method dominates across all filter types and dataset scales; some methods that perform well at medium scale fail on the large-scale dataset, which points to the difficulty of handling high-dimensional transformer embeddings. The benchmark and empirical analysis provide a resource for future algorithm design, rigorous evaluation, and reproducible FANNS research.

📝 Abstract
Advances in embedding models for text, image, audio, and video drive progress across multiple domains, including retrieval-augmented generation, recommendation systems, vehicle/person reidentification, and face recognition. Many applications in these domains require an efficient method to retrieve items that are close to a given query in the embedding space while satisfying a filter condition based on the item's attributes, a problem known as Filtered Approximate Nearest Neighbor Search (FANNS). In this work, we present a comprehensive survey and taxonomy of FANNS methods and analyze how they are benchmarked in the literature. By doing so, we identify a key challenge in the current FANNS landscape: the lack of diverse and realistic datasets, particularly ones derived from the latest transformer-based text embedding models. To address this, we introduce a novel dataset consisting of embedding vectors for the abstracts of over 2.7 million research articles from the arXiv repository, accompanied by 11 real-world attributes such as authors and categories. We benchmark a wide range of FANNS methods on our novel dataset and find that each method has distinct strengths and limitations; no single approach performs best across all scenarios. ACORN, for example, supports various filter types and performs reliably across dataset scales but is often outperformed by more specialized methods. SeRF shows excellent performance for range filtering on ordered attributes but cannot handle categorical attributes. Filtered-DiskANN and UNG excel on the medium-scale dataset but fail on the large-scale dataset, highlighting the challenge posed by transformer-based embeddings, which are often more than an order of magnitude larger than earlier embeddings. We conclude that no universally best method exists.
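To make the FANNS problem statement concrete, the following is a minimal sketch of an exact, brute-force filtered search: keep only the items whose attributes satisfy the filter, then rank the survivors by distance to the query. It is not any of the benchmarked methods (those are approximate and index-based); an exact baseline of this kind is typically what approximate results are compared against. The function name `filtered_knn`, the toy vectors, and the `category` and `year` attribute names are assumptions chosen for illustration and are not taken from the paper's dataset schema.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# One item = (embedding vector, attribute dictionary). Hypothetical layout.
Item = Tuple[Sequence[float], Dict[str, object]]


def filtered_knn(
    items: List[Item],
    query: Sequence[float],
    predicate: Callable[[Dict[str, object]], bool],
    k: int,
) -> List[int]:
    """Exact filtered k-NN: apply the filter first, then rank by Euclidean distance."""
    candidates = [
        (math.dist(vec, query), idx)
        for idx, (vec, attrs) in enumerate(items)
        if predicate(attrs)  # discard items that fail the filter condition
    ]
    candidates.sort()  # ascending distance; ties broken by item index
    return [idx for _, idx in candidates[:k]]


# Toy data: 2-dimensional "embeddings" with illustrative attributes.
items: List[Item] = [
    ([0.0, 0.0], {"category": "cs.DB", "year": 2019}),
    ([0.1, 0.0], {"category": "cs.LG", "year": 2021}),
    ([0.2, 0.1], {"category": "cs.DB", "year": 2023}),
    ([0.9, 0.9], {"category": "cs.DB", "year": 2024}),
    ([0.05, 0.05], {"category": "cs.IR", "year": 2022}),
]
query = [0.0, 0.0]

# Categorical filter: exact match on a category label.
categorical = filtered_knn(items, query, lambda a: a["category"] == "cs.DB", k=2)

# Range filter on an ordered attribute.
ranged = filtered_knn(items, query, lambda a: 2021 <= a["year"] <= 2023, k=2)

print(categorical)  # [0, 2]
print(ranged)       # [4, 1]
```

The two calls mirror the distinction the abstract draws between categorical filters and range filters on ordered attributes. A linear scan like this costs time proportional to the dataset size per query, which is why dedicated FANNS indexes are needed at the scale of millions of high-dimensional vectors.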
Problem

Research questions and friction points this paper is trying to address.

Evaluating FANNS methods on transformer-based embedding vectors
Addressing the lack of diverse, realistic datasets for FANNS benchmarking
Comparing performance of FANNS algorithms across different scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Survey and taxonomy of FANNS methods
Novel dataset with transformer-based embeddings
Benchmarking diverse FANNS algorithms