Aligned Query Expansion: Efficient Query Expansion for Information Retrieval through LLM Alignment

📅 2025-07-15

📈 Citations: 0

✨ Influential: 0

career value

191K/year

🤖 AI Summary

In information retrieval, LLM-based query expansion via greedy decoding suffers from hallucination, while mainstream “generate-then-filter” paradigms incur high computational overhead and lack guidance during generation. This paper proposes Aligned Query Expansion (AQE), the first method to directly integrate LLM alignment techniques—such as Reinforcement Learning from Human Feedback (RLHF) and supervised fine-tuning—into query expansion, enabling end-to-end optimization of expansion relevance without post-hoc filtering. AQE jointly models semantic alignment and lexical matching, leveraging instruction tuning to guide the model toward generating more precise and retrieval-friendly expansion terms. Evaluated on multiple open-domain question answering benchmarks, AQE consistently outperforms strong baselines, improving paragraph retrieval accuracy (e.g., NDCG@10 by 3.2–5.8%) in both in-domain and zero-shot cross-domain settings. Moreover, it reduces inference latency and GPU memory consumption.

Technology Category

Application Category

📝 Abstract

With the breakthroughs in large language models (LLMs), query generation techniques that expand documents and queries with related terms are becoming increasingly popular in the information retrieval field. Such techniques have been shown to improve the effectiveness of traditional lexical retrieval methods by dealing with the vocabulary mismatch problem. Recent work has found that generating queries with a greedy decoding strategy can produce sub-optimal queries, including hallucinations, and proposed to filter out queries before expansion. This `generate-then-filter' approach is costly, as it requires generating multiple queries and applying a relevance model to all of them and does not teach the LLM which of the generated queries is more effective for expansion. To overcome such limitations, we propose Aligned Query Expansion (AQE), a novel approach to enhance query expansion for passage retrieval in open-domain question answering. AQE leverages recent techniques in LLM alignment to fine-tune models for generating query expansions that directly optimize the effectiveness of the retrieval task, eliminating the need for additional filtering steps. This alignment ensures that queries are more relevant, reducing computational costs while improving retrieval effectiveness. Empirical evaluations show that AQE outperforms baseline models for query expansion in both in-domain and out-of-domain settings, demonstrating significant improvements in retrieval effectiveness.

Problem

Research questions and friction points this paper is trying to address.

Overcoming sub-optimal query generation in information retrieval

Reducing computational costs of query expansion filtering

Improving retrieval effectiveness with aligned query expansion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Aligns LLM to optimize query expansion directly

Eliminates need for costly generate-then-filter steps

Improves retrieval effectiveness across diverse domains

🔎 Similar Papers

No similar papers found.