AI Summary
Conversational search suffers from poor generalization in query rewriting, particularly in modeling users' implicit intents and contextual dependencies across dialogue history.
Method: This paper proposes the first sparse-dense dual-path retrieval alignment framework for conversational query rewriting. It introduces a two-stage training strategy integrating high-quality automatically generated labels with diverse candidate construction, and synergistically combines BM25-based lexical matching and BERT-based semantic retrieval. The framework is jointly optimized via contrastive learning and supervised fine-tuning.
Contributions/Results: On the TopiOCQA and QReCC benchmarks, the method outperforms prior state-of-the-art approaches in rewrite accuracy and is more robust across heterogeneous retrieval systems. It balances inference efficiency with effectiveness.
Abstract
Conversational Query Reformulation (CQR) has advanced significantly in addressing the challenges of conversational search, particularly those stemming from latent user intent and the need for historical context. Recent work has aimed to boost CQR performance through alignment. However, these methods are designed for one specific retrieval system, which can result in sub-optimal generalization. To overcome this limitation, we present a novel framework, AdaCQR. By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries across diverse retrieval environments through a two-stage training strategy. Moreover, two effective approaches are proposed to obtain superior labels and diverse input candidates, boosting the efficiency and robustness of the framework. Experimental results on the TopiOCQA and QReCC datasets demonstrate that AdaCQR outperforms existing methods within a more efficient framework, offering both quantitative and qualitative improvements in conversational query reformulation.
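
To make the idea of scoring reformulations against both a term-based and a semantic-based retriever more concrete, here is a minimal sketch. It is not the AdaCQR implementation: the abstract does not say how the two retrieval signals are combined, so the sketch assumes a simple average of the gold passage's reciprocal rank under each retriever. The scorers are toy stand-ins (term overlap in place of BM25, bag-of-words cosine in place of a dense encoder), and all function names and data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; enough for a toy example.
    return text.lower().split()


def term_score(query, passage):
    # Toy lexical scorer (stand-in for BM25): count of distinct shared terms.
    return len(set(tokenize(query)) & set(tokenize(passage)))


def semantic_score(query, passage):
    # Toy "semantic" scorer (stand-in for a dense encoder):
    # cosine similarity between bag-of-words count vectors.
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    dot = sum(q[t] * p[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def gold_rank(query, passages, gold_idx, scorer):
    # 1-based rank of the gold passage; ties are broken by passage index.
    scores = [scorer(query, p) for p in passages]
    order = sorted(range(len(passages)), key=lambda i: (-scores[i], i))
    return order.index(gold_idx) + 1


def fused_score(query, passages, gold_idx):
    # Assumption: average the reciprocal ranks from the two retrievers.
    rr_term = 1.0 / gold_rank(query, passages, gold_idx, term_score)
    rr_sem = 1.0 / gold_rank(query, passages, gold_idx, semantic_score)
    return 0.5 * (rr_term + rr_sem)


def rank_candidates(candidates, passages, gold_idx):
    # Order candidate reformulations from best to worst fused score.
    return sorted(
        candidates,
        key=lambda c: fused_score(c, passages, gold_idx),
        reverse=True,
    )


passages = [
    "the eiffel tower is in paris france",
    "the eiffel tower was completed in 1889",  # gold passage
    "paris is the capital of france",
    "the bridge was completed in 1932 when it opened",
]
candidates = [
    "when was it completed",                # unresolved pronoun
    "when was the eiffel tower completed",  # self-contained rewrite
    "where is it",                          # off-topic
]

ranked = rank_candidates(candidates, passages, gold_idx=1)
print(ranked[0])
```

In this toy setup the self-contained rewrite ranks first because it retrieves the gold passage at the top under both scorers, while the rewrite with the unresolved pronoun is pulled toward the distractor passage. An ordering of this kind is one plausible way to derive preference signals for alignment training, though the actual procedure used by the paper may differ.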