LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table

📅 2025-06-05
🤖 AI Summary
To reduce redundancy and improve diversity in approximate nearest neighbor search (ANNS) results, particularly in retrieval-augmented generation (RAG) scenarios, this paper proposes LotusFilter, a lightweight greedy post-processing method built on a precomputed, learned cutoff table. The method filters the candidate set by table lookup, explicitly optimizing result diversity while preserving relevance to the query. It avoids costly re-ranking and requires only a single embedding inference (e.g., with OpenAI embeddings). Evaluated in settings resembling real-world RAG applications, it reaches an average latency of 0.02 ms per query, improves diversity metrics (e.g., inter-list distance ↑32%), and maintains high retrieval accuracy (Recall@10 degradation <0.5%). The work is presented as the first to integrate cutoff tables into diversity-aware ANNS post-processing, trading off efficiency, relevance, and diversity.

📝 Abstract
Approximate nearest neighbor search (ANNS) is an essential building block for applications like RAG but can sometimes yield results that are overly similar to each other. In certain scenarios, search results should be similar to the query and yet diverse. We propose LotusFilter, a post-processing module to diversify ANNS results. We precompute a cutoff table summarizing vectors that are close to each other. During the filtering, LotusFilter greedily looks up the table to delete redundant vectors from the candidates. We demonstrated that the LotusFilter operates fast (0.02 [ms/query]) in settings resembling real-world RAG applications, utilizing features such as OpenAI embeddings. Our code is publicly available at https://github.com/matsui528/lotf.
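The two steps described in the abstract, precomputing a cutoff table and greedily looking it up to delete redundant candidates, can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes that the table stores, for each vector, the ids of other vectors within a squared L2 distance threshold `eps`, and that candidates arrive sorted by ascending distance to the query. The names `build_cutoff_table`, `greedy_filter`, and `eps` are hypothetical, and the brute-force table construction is only suitable for toy data.

```python
import numpy as np


def build_cutoff_table(vectors: np.ndarray, eps: float) -> list[set[int]]:
    """For each vector i, collect ids of other vectors with squared L2 distance < eps.

    Assumption: this stands in for the paper's precomputed cutoff table.
    Brute force O(N^2) memory and time, so toy-sized inputs only.
    """
    diff = vectors[:, None, :] - vectors[None, :, :]
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    table = []
    for i in range(len(vectors)):
        close = np.flatnonzero(sq_dist[i] < eps)
        table.append({int(j) for j in close if j != i})
    return table


def greedy_filter(candidates: list[int], table: list[set[int]], k: int) -> list[int]:
    """Keep up to k candidates, dropping any that are close to an already kept one.

    `candidates` must be sorted by ascending distance to the query
    (i.e., the order an ANNS index would return them in).
    """
    removed: set[int] = set()
    selected: list[int] = []
    for c in candidates:
        if c in removed:
            continue  # redundant with an earlier, more relevant result
        selected.append(c)
        if len(selected) == k:
            break
        removed |= table[c]  # table lookup: everything near c is now redundant
    return selected


# Toy data: ids 0/1 and 2/3 are near-duplicates, id 4 is isolated.
vectors = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [3.0, 0.0]])
table = build_cutoff_table(vectors, eps=0.04)

# Pretend an ANNS index returned these ids for a query near the origin.
candidates = [0, 1, 2, 3, 4]
result = greedy_filter(candidates, table, k=3)
print(result)  # [0, 2, 4]
```

In this sketch the per-query work is only set lookups over the candidate list, which is the property that makes a table-based filter cheap compared with recomputing pairwise distances at query time.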

Problem

Research questions and friction points this paper is trying to address.

Diversifies overly similar nearest neighbor search results
Speeds up filtering via precomputed cutoff tables
Optimizes performance for real-world RAG applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learned cutoff table for vector summarization
Greedy lookup to remove redundant vectors
Fast post-processing for diverse ANNS results