🤖 AI Summary
Dense retrievers perform poorly on entity-centric queries because they lack prior knowledge of low-frequency entities. To address this, we propose a knowledge-augmented retrieval framework that requires no retraining: it integrates entity information from external knowledge bases through a context-entity cross-attention mechanism and a plug-and-play dynamic entity embedding module that allows new knowledge to be injected at inference time. Built on the BERT/bge-base architecture, the approach keeps the model lightweight and efficient to deploy. Evaluation on EntityQuestions, WebQs, and TREC-QA shows consistent improvements, including a 12.6% gain in Recall@20 on EntityQuestions, and state-of-the-art results among models of comparable size on two benchmarks. The method is especially effective at improving recall and generalization for queries involving low-frequency entities.
📝 Abstract
Dense retrievers often struggle with queries involving less-frequent entities due to their limited entity knowledge. We propose the Knowledgeable Passage Retriever (KPR), a BERT-based retriever enhanced with a context-entity attention layer and dynamically updatable entity embeddings. This design enables KPR to incorporate external entity knowledge without retraining. Experiments on three datasets show that KPR consistently improves retrieval accuracy, achieving a substantial 12.6% gain on the EntityQuestions dataset over the model without KPR extensions. When built on the off-the-shelf bge-base retriever, KPR achieves state-of-the-art performance among similarly sized models on two datasets. Code and models will be released soon.
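To make the mechanism concrete, here is a minimal NumPy sketch of what a context-entity attention layer with a swappable entity embedding table could look like. This is an illustrative assumption, not KPR's actual implementation: the class name, single-head scaled dot-product formulation, and residual fusion are all hypothetical; the paper's exact parameterization may differ. It does, however, show the key property claimed above: new entity embeddings can be added to the table at inference time without retraining.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class ContextEntityAttention:
    """Hypothetical sketch: contextual token representations attend over
    embeddings of entities linked in the text; the attended entity
    knowledge is fused back into the tokens via a residual connection."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        # Single-head Q/K/V projections (assumed; the real layer may differ).
        self.Wq = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.Wk = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.Wv = rng.standard_normal((dim, dim)) / np.sqrt(dim)
        self.dim = dim

    def __call__(self, tokens, entity_table, entity_ids):
        # tokens: (L, d) contextual embeddings from the base encoder
        # entity_table: dict id -> (d,) embedding; the "plug-and-play"
        #   module, extendable at inference time without retraining
        ents = np.stack([entity_table[i] for i in entity_ids])   # (m, d)
        q = tokens @ self.Wq
        k = ents @ self.Wk
        v = ents @ self.Wv
        attn = softmax(q @ k.T / np.sqrt(self.dim))              # (L, m)
        return tokens + attn @ v                                 # residual fusion

# Usage: inject a previously unseen entity at inference time.
dim = 8
layer = ContextEntityAttention(dim)
table = {0: np.ones(dim)}
table[1] = np.full(dim, 0.5)   # new knowledge, no retraining needed
out = layer(np.zeros((3, dim)), table, entity_ids=[0, 1])
print(out.shape)
```

Because the entity table is an external lookup rather than part of the trained weights, updating entity knowledge reduces to replacing or appending rows in the table, which is what makes the design retraining-free.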