AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment

📅 2024-07-02

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: In conversational search, query rewriting generalizes poorly, particularly when it must model users' implicit intent and contextual dependencies across the dialogue history. Method: This paper proposes AdaCQR, a framework that aligns conversational query reformulation with both sparse and dense retrieval. It uses a two-stage training strategy that pairs high-quality automatically generated labels with diverse candidate construction, and it combines BM25-based lexical matching with dense semantic retrieval. The framework is optimized with supervised fine-tuning and contrastive learning. Contributions/Results: On the TopiOCQA and QReCC benchmarks, AdaCQR outperforms existing methods and is more robust across heterogeneous retrieval systems, while balancing inference efficiency and effectiveness.

📝 Abstract

Conversational Query Reformulation (CQR) has made significant progress in addressing the challenges of conversational search, particularly those stemming from latent user intent and the need for historical context. Recent work has aimed to boost the performance of CQR through alignment. However, these approaches are designed for one specific retrieval system, which potentially results in sub-optimal generalization. To overcome this limitation, we present a novel framework, AdaCQR. By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries across diverse retrieval environments through a two-stage training strategy. Moreover, two effective approaches are proposed to obtain superior labels and diverse input candidates, boosting the efficiency and robustness of the framework. Experimental results on the TopiOCQA and QReCC datasets demonstrate that AdaCQR outperforms existing methods within a more efficient framework, offering both quantitative and qualitative improvements in conversational query reformulation.
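
To make the idea of aligning with both term-based and semantic-based retrievers concrete, here is a minimal sketch of one way candidate reformulations could be ordered using feedback from two retrievers. It is an illustration under assumptions, not the paper's procedure: the rank-sum fusion rule, the function names (`ranks_from_scores`, `fuse_rankings`, `preference_pairs`), and the example scores are all hypothetical, and the abstract does not specify how the two signals are combined.

```python
from typing import Dict, List, Tuple


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each candidate to its 1-based rank (higher score ranks first)."""
    ordered = sorted(scores, key=lambda cand: scores[cand], reverse=True)
    return {cand: position + 1 for position, cand in enumerate(ordered)}


def fuse_rankings(sparse: Dict[str, float], dense: Dict[str, float]) -> List[str]:
    """Order candidates by the sum of their sparse and dense ranks.

    Assumption: a simple rank sum stands in for whatever fusion the
    paper actually uses. Ties are broken by candidate name so the
    result is deterministic.
    """
    sparse_rank = ranks_from_scores(sparse)
    dense_rank = ranks_from_scores(dense)
    fused = {cand: sparse_rank[cand] + dense_rank[cand] for cand in sparse}
    return sorted(fused, key=lambda cand: (fused[cand], cand))


def preference_pairs(ordered: List[str]) -> List[Tuple[str, str]]:
    """Build (preferred, rejected) pairs, e.g. as input to a contrastive loss."""
    return [
        (ordered[i], ordered[j])
        for i in range(len(ordered))
        for j in range(i + 1, len(ordered))
    ]


# Hypothetical retrieval-quality scores for four candidate reformulations.
sparse_scores = {"q_a": 0.9, "q_b": 0.7, "q_c": 0.5, "q_d": 0.1}
dense_scores = {"q_a": 0.6, "q_b": 0.9, "q_c": 0.8, "q_d": 0.2}

ordering = fuse_rankings(sparse_scores, dense_scores)
pairs = preference_pairs(ordering)
```

In this toy example no single retriever decides the outcome: `q_a` is the sparse retriever's favorite, but `q_b` comes first after fusion because it ranks well under both. Candidates that do well with only one retriever are pushed down, which is the intuition behind training a reformulation model against more than one retrieval system.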

Problem

Research questions and friction points this paper is trying to address.

Information Retrieval

User Intent Understanding

Consistent Performance Across Retrieval Systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

AdaCQR

Conversational Query Reformulation

Two-stage Training Strategy

🔎 Similar Papers

Conversational Query Reformulation with the Guidance of Retrieved Documents