LLM-based Query Expansion Fails for Unfamiliar and Ambiguous Queries

📅 2025-05-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper identifies fundamental limitations of large language model (LLM)-driven query expansion (QE) in two prevalent failure scenarios: (1) erroneous expansions due to LLM knowledge gaps, and (2) bias-prone refinements that narrow retrieval scope for semantically ambiguous queries. Method: To systematically dissect these failures, the authors first formally distinguish and empirically validate *knowledge insufficiency* and *ambiguity-induced bias* as orthogonal root causes; they then propose a novel QE evaluation framework that jointly measures knowledge coverage and ambiguity robustness, validated via controlled experiments across multiple benchmarks and both sparse (BM25) and dense (ColBERT) retrieval models. Contribution/Results: Quantitative analysis reveals that on knowledge-poor or highly ambiguous queries, NDCG@10 degrades by 18.7% on average, providing critical failure diagnostics and actionable guidance for improving LLM-augmented retrieval.

📝 Abstract
Query expansion (QE) enhances retrieval by incorporating relevant terms, with large language models (LLMs) offering an effective alternative to traditional rule-based and statistical methods. However, LLM-based QE suffers from a fundamental limitation: it often fails to generate relevant knowledge, degrading search performance. Prior studies have focused on hallucination, yet its underlying cause, LLM knowledge deficiencies, remains underexplored. This paper systematically examines two failure cases in LLM-based QE: (1) when the LLM lacks query knowledge, leading to incorrect expansions, and (2) when the query is ambiguous, causing biased refinements that narrow search coverage. We conduct controlled experiments across multiple datasets, evaluating the effects of knowledge and query ambiguity on retrieval performance using sparse and dense retrieval models. Our results reveal that LLM-based QE can significantly degrade retrieval effectiveness when knowledge in the LLM is insufficient or query ambiguity is high. We introduce a framework for evaluating QE under these conditions, providing insights into the limitations of LLM-based retrieval augmentation.
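The ambiguity-induced failure mode described above can be illustrated with a minimal, self-contained sketch (not the paper's code): when an expansion commits to the wrong sense of an ambiguous query, BM25 rankings shift toward the wrong documents and NDCG@10 drops. The toy corpus, the "jaguar speed" query, its expansion terms, and the relevance labels below are all invented for illustration.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against the query with standard BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = k1 * (1 - b + b * len(d) / avgdl)  # length normalization
            s += idf * tf[t] * (k1 + 1) / (tf[t] + norm)
        scores.append(s)
    return scores

def ndcg_at_k(ranked_rels, k=10):
    """NDCG@k over a list of relevance grades in ranked order."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(ranked_rels[:k]))
    idcg = sum(r / math.log2(i + 2)
               for i, r in enumerate(sorted(ranked_rels, reverse=True)[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# Toy corpus: "jaguar speed" is ambiguous (animal vs. car).
# Relevance labels assume the user meant the animal.
DOCS = [
    ["jaguar", "animal", "top", "speed", "speed"],   # relevant
    ["jaguar", "big", "cat", "speed", "hunting"],    # relevant
    ["jaguar", "car", "top", "speed", "engine"],     # not relevant
    ["jaguar", "car", "engine", "horsepower"],       # not relevant
]
RELS = [1, 1, 0, 0]

def ndcg_for_query(query_terms):
    scores = bm25_scores(query_terms, DOCS)
    order = sorted(range(len(DOCS)), key=lambda i: -scores[i])
    return ndcg_at_k([RELS[i] for i in order])

original = ndcg_for_query(["jaguar", "speed"])
# A hypothetical LLM expansion that commits to the wrong sense:
biased = ndcg_for_query(["jaguar", "speed", "car", "engine"])
print(f"NDCG@10 original: {original:.3f}, biased expansion: {biased:.3f}")
```

On this toy data the unexpanded query ranks the relevant documents first, while the biased expansion pulls the two car documents to the top and lowers NDCG@10, mirroring the narrowed-coverage failure the paper studies at scale.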
Problem

Research questions and friction points this paper is trying to address.

LLM-based query expansion fails for unfamiliar queries
Ambiguous queries cause biased refinements in LLM-QE
LLM knowledge gaps degrade search performance significantly
Innovation

Methods, ideas, or system contributions that make the work stand out.

Formally distinguishes knowledge insufficiency from ambiguity-induced bias
Evaluates QE under knowledge deficiency and ambiguity
Proposes framework for evaluating LLM-based QE