🤖 AI Summary
Modern information retrieval faces lexical mismatch between short queries and dynamic, heterogeneous corpora, which calls for query expansion (QE) methods adapted to the large language model (LLM) era. This paper proposes a four-dimensional analytical framework (injection mechanisms, knowledge fusion, alignment learning, and knowledge graph grounding) and uses it to systematically survey QE techniques driven by pre-trained language models (PLMs) and LLMs across encoder-only, encoder-decoder, decoder-only, instruction-tuned, and multilingual architectures, including their integration with external knowledge bases, conversational interaction, and knowledge graphs. The survey covers seven retrieval scenarios, highlights LLMs' strengths in zero-shot QE and controllable generation, and identifies robust strategies to mitigate topic drift and hallucination. The work establishes a unified taxonomy for QE and outlines research directions in quality control, cost optimization, domain adaptation, and fairness evaluation.
📝 Abstract
Modern information retrieval (IR) must bridge short, ambiguous queries and increasingly diverse, rapidly evolving corpora. Query expansion (QE) remains a key mechanism for mitigating vocabulary mismatch, but the design space has shifted markedly with pre-trained language models (PLMs) and large language models (LLMs). This survey synthesizes the field from three angles: (i) a four-dimensional framework of query expansion, running from the point of injection (explicit vs. implicit QE), through grounding and interaction (knowledge bases, model-internal capabilities, multi-turn retrieval) and learning alignment, to knowledge graph-based argumentation; (ii) a model-centric taxonomy spanning encoder-only, encoder-decoder, decoder-only, instruction-tuned, and domain/multilingual variants, highlighting their characteristic affordances for QE (contextual disambiguation, controllable generation, zero-/few-shot reasoning); and (iii) practice-oriented guidance on where and how neural QE helps in first-stage retrieval, multi-query fusion, re-ranking, and retrieval-augmented generation (RAG). We compare traditional query expansion with PLM/LLM-based methods across seven key aspects, and we map applications across web search, biomedicine, e-commerce, open-domain QA/RAG, conversational and code search, and cross-lingual settings. The review distills design principles, namely grounding and interaction, alignment/distillation (SFT/PEFT/DPO), and KG constraints, as robust remedies to topic drift and hallucination. We conclude with an agenda on quality control, cost-aware invocation, domain/temporal adaptation, evaluation beyond end-task metrics, and fairness/privacy. Collectively, these insights provide a principled blueprint for selecting and combining QE techniques under real-world constraints.