Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of traditional retrieval-augmented generation (RAG) in multi-choice decision tasks, where reliance on the initial query alone often fails to retrieve discriminative evidence. The authors propose HCQR, a training-free pre-retrieval framework that, they argue, introduces a hypothesis-driven mechanism into RAG for the first time. Using a large language model, HCQR derives a lightweight working hypothesis from the question and candidate options, then reformulates retrieval into three targeted queries—supporting the hypothesis, distinguishing it from competing options, and verifying critical clues in the question—to steer retrieval toward decision-relevant evidence. The approach integrates with existing retrievers and generators without additional training. Experiments on MedQA and MMLU-Med show HCQR improves average accuracy over Simple RAG by 5.9 and 3.6 percentage points, respectively, and surpasses re-ranking and filtering baselines.

📝 Abstract
Retrieval-Augmented Generation (RAG) improves Large Language Models (LLMs) by grounding generation in external, non-parametric knowledge. However, when a task requires choosing among competing options, simply grounding generation in broadly relevant context is often insufficient to drive the final decision. Existing RAG methods typically rely on a single initial query, which often favors topical relevance over decision-relevant evidence, and therefore retrieves background information that can fail to discriminate among answer options. To address this issue, here we propose Hypothesis-Conditioned Query Rewriting (HCQR), a training-free pre-retrieval framework that reorients RAG from topic-oriented retrieval to evidence-oriented retrieval. HCQR first derives a lightweight working hypothesis from the input question and candidate options, and then rewrites retrieval into three targeted queries that seek evidence to: (1) support the hypothesis, (2) distinguish it from competing alternatives, and (3) verify salient clues in the question. This approach enables context retrieval that is more directly aligned with answer selection, allowing the generator to confirm or overturn the initial hypothesis based on the retrieved evidence. Experiments on MedQA and MMLU-Med show that HCQR consistently outperforms single-query RAG and re-rank/filter baselines, improving average accuracy over Simple RAG by 5.9 and 3.6 points, respectively. Code is available at https://anonymous.4open.science/r/HCQR-1C2E.
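The abstract's three-query rewriting can be sketched in a few lines of Python. This is not the authors' released code (see the linked repository for that); the prompt wording and the `llm`/`retriever` callables below are illustrative assumptions, shown only to make the pre-retrieval pipeline concrete: hypothesize, rewrite into support/distinguish/verify queries, retrieve per query, then let the generator confirm or overturn the hypothesis.

```python
from typing import Callable, List


def hcqr_queries(question: str, options: List[str],
                 llm: Callable[[str], str]) -> List[str]:
    """Hypothesis-Conditioned Query Rewriting (illustrative sketch).

    Derives a lightweight working hypothesis from the question and
    candidate options, then rewrites retrieval into three targeted
    queries: (1) support, (2) distinguish, (3) verify.
    """
    # Hypothetical prompt; the paper's actual prompt may differ.
    hypothesis = llm(
        f"Question: {question}\nOptions: {'; '.join(options)}\n"
        "State the single most likely answer as a short hypothesis."
    )
    competitors = "; ".join(o for o in options if o != hypothesis)
    return [
        f"Evidence supporting: {hypothesis}",
        f"Evidence distinguishing {hypothesis} from: {competitors}",
        f"Evidence verifying key clues in: {question}",
    ]


def hcqr_answer(question: str, options: List[str],
                llm: Callable[[str], str],
                retriever: Callable[[str], List[str]],
                top_k: int = 3) -> str:
    """Retrieve with all three queries, then generate the final answer."""
    contexts: List[str] = []
    for q in hcqr_queries(question, options, llm):
        contexts.extend(retriever(q)[:top_k])
    # Deduplicate while preserving retrieval order.
    ctx = "\n".join(dict.fromkeys(contexts))
    return llm(
        f"Context:\n{ctx}\nQuestion: {question}\n"
        f"Options: {'; '.join(options)}\nAnswer:"
    )
```

Because the framework is pre-retrieval and training-free, any retriever and generator can be plugged in unchanged; only the query set fed to the retriever differs from Simple RAG.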
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
decision-making
query rewriting
evidence retrieval
answer selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypothesis-Conditioned Query Rewriting
Retrieval-Augmented Generation
Evidence-Oriented Retrieval
Query Rewriting
Decision-Useful Retrieval
Hangeol Chang
Graduate School of AI, KAIST, Republic of Korea
Changsun Lee
Graduate School of AI, KAIST, Republic of Korea
Seungjoon Rho
School of Electrical Engineering, KAIST, Republic of Korea
Junho Yeo
Department of Industrial Engineering, Yonsei University, Republic of Korea
Jong Chul Ye
Professor, Chung Moon Soul Chair, Graduate School of AI, KAIST
machine learning, computational imaging, medical imaging, signal processing, compressed sensing