With Argus Eyes: Assessing Retrieval Gaps via Uncertainty Scoring to Detect and Remedy Retrieval Blind Spots

📅 2026-02-10
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the "retrieval blind spot" problem in neural retrievers within RAG systems, where relevant entities are missed because their embeddings have low similarity to the query. The authors are the first to analyze this issue through the geometric structure of the embedding space, and they introduce the Retrieval Probability Score (RPS), a metric that predicts retrieval risk without performing actual retrieval. Building on this insight, they propose the ARGUS pipeline, which leverages knowledge bases such as Wikidata to selectively enrich high-risk documents, thereby enhancing the retrievability of pertinent entities. Evaluated on the BRIGHT, IMPLIRET, and RAR-B benchmarks, ARGUS consistently improves the performance of mainstream retrievers, yielding average gains of +3.4 nDCG@5 and +4.5 nDCG@10, with particularly pronounced improvements on challenging subsets.

๐Ÿ“ Abstract
Reliable retrieval-augmented generation (RAG) systems depend fundamentally on the retriever's ability to find relevant information. We show that neural retrievers used in RAG systems have blind spots, which we define as the failure to retrieve entities that are relevant to the query but have low similarity to the query embedding. We investigate the training-induced biases that cause such blind spot entities to be mapped to inaccessible parts of the embedding space, resulting in low retrievability. Using a large-scale dataset constructed from Wikidata relations and the first paragraphs of Wikipedia articles, and our proposed Retrieval Probability Score (RPS), we show that blind spot risk in standard retrievers (e.g., CONTRIEVER, REASONIR) can be predicted pre-index from entity embedding geometry, avoiding expensive retrieval evaluations. To address these blind spots, we introduce ARGUS, a pipeline that restores the retrievability of high-risk (low-RPS) entities through targeted document augmentation from a knowledge base (KB), in our case the first paragraphs of Wikipedia. Extensive experiments on BRIGHT, IMPLIRET, and RAR-B show that ARGUS achieves consistent improvements across all evaluated retrievers (averaging +3.4 nDCG@5 and +4.5 nDCG@10 absolute points), with substantially larger gains on challenging subsets. These results establish that preemptively remedying blind spots is critical for building robust and trustworthy RAG systems (Code and Data).
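The abstract does not spell out how RPS is computed, but the core idea, scoring an entity's retrieval risk purely from embedding geometry before building an index, can be illustrated with a minimal sketch. The function name `rps_like_score`, the k-nearest-query heuristic, and the threshold are all hypothetical stand-ins for the paper's actual formulation:

```python
import numpy as np

def rps_like_score(entity_vecs, query_vecs, k=5):
    """Hypothetical geometry-based risk proxy (NOT the paper's exact RPS):
    mean cosine similarity of each entity embedding to its k most similar
    query embeddings. A low score suggests the entity sits in a region of
    the embedding space that queries rarely reach, i.e., a potential
    retrieval blind spot."""
    e = entity_vecs / np.linalg.norm(entity_vecs, axis=1, keepdims=True)
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    sims = e @ q.T                        # pairwise cosine similarities
    topk = np.sort(sims, axis=1)[:, -k:]  # k highest similarities per entity
    return topk.mean(axis=1)              # higher = more retrievable

def flag_high_risk(scores, threshold=0.3):
    """Entities scoring below the threshold would be candidates for
    targeted KB augmentation (e.g., appending Wikipedia first paragraphs)."""
    return np.where(scores < threshold)[0]
```

Note that such a score needs only embeddings, not an index or retrieval runs, which is what makes a pre-index risk prediction cheap compared to evaluating recall directly.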
Problem

Research questions and friction points this paper is trying to address.

retrieval blind spots
retrieval-augmented generation
neural retrievers
entity retrievability
embedding space
Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval blind spots
Retrieval Probability Score (RPS)
embedding geometry
targeted document augmentation
retrieval-augmented generation (RAG)
🔎 Similar Papers
No similar papers found.
Zeinab Sadat Taghavi
Lucerne University of Applied Sciences and Arts (HSLU); Center for Information and Language Processing (CIS), Ludwig Maximilian University of Munich (LMU); Munich Center for Machine Learning (MCML)
Ali Modarressi
PhD student at LMU Munich
Natural Language Processing, Deep Learning, Artificial Intelligence
Hinrich Schütze
Center for Information and Language Processing (CIS), Ludwig Maximilian University of Munich (LMU); Munich Center for Machine Learning (MCML)
Andreas Marfurt
Lucerne University of Applied Sciences and Arts (HSLU)