🤖 AI Summary
This work addresses the low efficiency and high memory overhead inherent in fusing dense and sparse retrieval. We propose CluSD, a novel two-stage clustering-based selection framework guided by sparse retrieval (e.g., BM25). It employs a lightweight LSTM model to rapidly identify semantically relevant clusters and dynamically triggers localized dense retrieval and block-level disk I/O only where needed. This enables dynamic partial dense retrieval with minimal memory overhead (under 8% increase). On the MS MARCO and BEIR benchmarks, CluSD achieves 2.3× faster retrieval than baseline methods while maintaining state-of-the-art (SOTA) mAP and Recall@100. The approach thus uniquely balances retrieval accuracy, latency, and resource efficiency, advancing the practical deployment of hybrid retrieval systems.
📝 Abstract
This paper studies fast fusion of dense retrieval and sparse lexical retrieval, and proposes a cluster-based selective dense retrieval method called CluSD, guided by sparse lexical retrieval. CluSD takes a lightweight cluster-based approach and exploits the overlap between sparse retrieval results and embedding clusters in a two-stage selection process with an LSTM model, quickly identifying relevant clusters while incurring limited extra memory overhead. CluSD triggers partial dense retrieval and performs cluster-based block disk I/O if needed. This paper evaluates CluSD and compares it with several baselines for searching in-memory and on-disk MS MARCO and BEIR datasets.
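To make the core idea concrete, here is a minimal sketch of overlap-guided partial dense retrieval. It is an illustration of the general technique only, not CluSD itself: stage 1 counts how many of the sparse top results fall into each embedding cluster, and stage 2 selects clusters for dense scoring. CluSD uses a learned LSTM selector for stage 2; a fixed overlap threshold stands in for it here, and all function names and parameters are hypothetical.

```python
import numpy as np
from collections import Counter

def cluster_guided_dense_search(query_emb, sparse_top_docs, doc_cluster,
                                doc_embs, overlap_threshold=1, top_k=3):
    """Toy sketch: pick clusters via sparse-result overlap, then run
    dense (dot-product) scoring only over docs in the chosen clusters."""
    # Stage 1: count how many sparse-retrieved docs land in each cluster.
    overlap = Counter(doc_cluster[d] for d in sparse_top_docs)
    # Stage 2: keep clusters whose overlap passes a threshold
    # (CluSD instead uses a lightweight LSTM to make this decision).
    selected = {c for c, n in overlap.items() if n >= overlap_threshold}
    # Partial dense retrieval: score only docs in the selected clusters,
    # so embeddings of unselected clusters never need to be touched
    # (on disk, this is where block-level I/O would be skipped).
    candidates = [d for d, c in enumerate(doc_cluster) if c in selected]
    scores = doc_embs[candidates] @ query_emb
    order = np.argsort(-scores)[:top_k]
    return [candidates[i] for i in order]

# Tiny synthetic example: 6 docs in 3 clusters of 2.
doc_cluster = [0, 0, 1, 1, 2, 2]
doc_embs = np.array([[0.9, 0.1], [0.2, 0.8],
                     [0.8, 0.2], [0.1, 0.9],
                     [1.0, 0.0], [0.0, 1.0]])
query_emb = np.array([1.0, 0.0])
# Sparse retrieval returned docs 0 and 2, so clusters 0 and 1 are probed;
# cluster 2 is skipped even though doc 4 would score highest.
result = cluster_guided_dense_search(query_emb, [0, 2], doc_cluster,
                                     doc_embs, top_k=2)
```

Note how the sketch exposes the trade-off the paper studies: skipping clusters bounds dense-scoring cost and I/O, at the risk of missing relevant documents in unselected clusters, which is why cluster selection quality matters.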