🤖 AI Summary
This work addresses the low efficiency and high memory overhead inherent in fusing dense and sparse retrieval. We propose CluSD, a novel two-stage clustering-based selection framework guided by sparse retrieval (e.g., BM25). It employs a lightweight LSTM model to rapidly identify semantically relevant clusters and dynamically triggers localized dense retrieval and block-level disk I/O only where needed. This enables dynamic partial dense retrieval with minimal memory overhead (under 8% increase). On the MS MARCO and BEIR benchmarks, CluSD achieves 2.3× faster retrieval than baseline methods while maintaining state-of-the-art (SOTA) mAP and Recall@100. The approach thus uniquely balances retrieval accuracy, latency, and resource efficiency, advancing the practical deployment of hybrid retrieval systems.
📝 Abstract
This paper studies fast fusion of dense retrieval and sparse lexical retrieval, and proposes a cluster-based selective dense retrieval method called CluSD, guided by sparse lexical retrieval. CluSD takes a lightweight cluster-based approach and exploits the overlap between sparse retrieval results and embedding clusters in a two-stage selection process with an LSTM model, quickly identifying relevant clusters while incurring limited extra memory overhead. CluSD triggers partial dense retrieval and performs cluster-based block disk I/O if needed. This paper evaluates CluSD and compares it with several baselines for searching in-memory and on-disk MS MARCO and BEIR datasets.
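To make the core idea concrete, here is a minimal sketch of overlap-guided partial dense retrieval. It is an illustration of the general technique only, not CluSD itself: stage 1 counts how many of the sparse top results fall into each embedding cluster, and stage 2 selects clusters for dense scoring. CluSD uses a learned LSTM selector for stage 2; a fixed overlap threshold stands in for it here, and all function names and parameters are hypothetical.

```python
import numpy as np
from collections import Counter

def cluster_guided_dense_search(query_emb, sparse_top_docs, doc_cluster,
                                doc_embs, overlap_threshold=1, top_k=3):
    """Toy sketch: pick clusters via sparse-result overlap, then run
    dense (dot-product) scoring only over docs in the chosen clusters."""
    # Stage 1: count how many sparse-retrieved docs land in each cluster.
    overlap = Counter(doc_cluster[d] for d in sparse_top_docs)
    # Stage 2: keep clusters whose overlap passes a threshold
    # (CluSD instead uses a lightweight LSTM to make this decision).
    selected = {c for c, n in overlap.items() if n >= overlap_threshold}
    # Partial dense retrieval: score only docs in the selected clusters,
    # so embeddings of unselected clusters never need to be touched
    # (on disk, this is where block-level I/O would be skipped).
    candidates = [d for d, c in enumerate(doc_cluster) if c in selected]
    scores = doc_embs[candidates] @ query_emb
    order = np.argsort(-scores)[:top_k]
    return [candidates[i] for i in order]

# Tiny synthetic example: 6 docs in 3 clusters of 2.
doc_cluster = [0, 0, 1, 1, 2, 2]
doc_embs = np.array([[0.9, 0.1], [0.2, 0.8],
                     [0.8, 0.2], [0.1, 0.9],
                     [1.0, 0.0], [0.0, 1.0]])
query_emb = np.array([1.0, 0.0])
# Sparse retrieval returned docs 0 and 2, so clusters 0 and 1 are probed;
# cluster 2 is skipped even though doc 4 would score highest.
result = cluster_guided_dense_search(query_emb, [0, 2], doc_cluster,
                                     doc_embs, top_k=2)
```

Note how the sketch exposes the trade-off the paper studies: skipping clusters bounds dense-scoring cost and I/O, at the risk of missing relevant documents in unselected clusters, which is why cluster selection quality matters.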