🤖 AI Summary
Existing approaches to conversational query rewriting often overlook the feedback loop among rewriting, retrieval, and response generation, limiting performance gains. This work addresses this gap by first constructing a self-consistent preference alignment dataset that jointly captures interactions among these three components. Building upon this, we propose a prefix-guided, multi-dimensional direct preference optimization method to enable unified modeling of query rewriting, retrieval, and response generation. Our approach significantly enhances both the diversity and consistency of rewritten queries and substantially improves conversational search effectiveness in both in-distribution and out-of-distribution settings, demonstrating strong empirical validity and generalization capability.
📝 Abstract
Conversational Query Rewriting (CQR) aims to rewrite ambiguous queries to enable more effective conversational search. Early studies have predominantly treated rewriting in isolation, ignoring feedback from query rewriting, passage retrieval, and response generation during the rewriting process. To address this issue, we propose Multi-Faceted Self-Consistent Preference Aligned CQR (MSPA-CQR). Specifically, we first construct self-consistent preference alignment data along three dimensions (rewriting, retrieval, and response) to generate more diverse rewritten queries. We then propose prefix-guided multi-faceted direct preference optimization to learn preference information from these three dimensions. Experimental results show that MSPA-CQR is effective in both in-distribution and out-of-distribution scenarios.
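The abstract does not spell out the optimization objective, but a multi-faceted direct preference optimization can be read as a weighted sum of standard DPO losses, one per dimension (rewriting, retrieval, response), where in the paper's setup a dimension-specific prefix would determine which preference pair the model is scored on. The sketch below is illustrative only; the function names, the per-dimension weighting, and the `beta` value are assumptions, not the authors' released implementation:

```python
import math

def dpo_loss(logp_w, logp_l, ref_w, ref_l, beta=0.1):
    """Standard DPO loss for one preference pair.

    logp_w / logp_l: policy log-probs of the preferred / dispreferred output;
    ref_w / ref_l: the same log-probs under the frozen reference model.
    """
    margin = beta * ((logp_w - ref_w) - (logp_l - ref_l))
    # -log sigmoid(margin): small when the policy prefers the chosen output
    # more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def multi_faceted_dpo_loss(pairs, weights=None, beta=0.1):
    """Sum per-dimension DPO losses over the three facets.

    `pairs` maps a facet name ("rewriting", "retrieval", "response") to a
    tuple (logp_w, logp_l, ref_w, ref_l); each facet's pair would be scored
    under its own guiding prefix in the paper's setting.
    """
    weights = weights or {facet: 1.0 for facet in pairs}
    return sum(weights[facet] * dpo_loss(*logps, beta=beta)
               for facet, logps in pairs.items())
```

With all log-probs equal, each facet contributes the chance-level loss log 2, so the three-facet total is 3·log 2; training then pushes each facet's margin positive, lowering its term below log 2.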