Bridging the Gap: From Ad-hoc to Proactive Search in Conversations

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: In proactive search in conversations (PSC), ad-hoc retrievers lose substantial performance because of an input mismatch: they are pre-trained on short, precise queries, whereas PSC inputs are long, noisy conversational contexts. Method: This paper proposes Conv2Query, a plug-and-play conversation-to-query mapping framework. It is the first to systematically identify and model this input mismatch, using a sequence-to-sequence architecture to read the dialogue and generate a concise, retrieval-oriented query. The mapper is trained with supervised fine-tuning; its queries can be passed to an existing retriever without modifying or retraining it, or used to fine-tune that retriever further on PSC data. Contribution/Results: Conv2Query bridges the input gap between conversational contexts and ad-hoc retrievers. On two PSC benchmarks, it achieves up to 18.3% absolute improvement in Recall@10. Both direct deployment and further fine-tuning on PSC data yield state-of-the-art performance.

📝 Abstract
Proactive search in conversations (PSC) aims to reduce user effort in formulating explicit queries by proactively retrieving useful relevant information given conversational context. Previous work in PSC either directly uses this context as input to off-the-shelf ad-hoc retrievers or further fine-tunes them on PSC data. However, ad-hoc retrievers are pre-trained on short and concise queries, while the PSC input is longer and noisier. This input mismatch between ad-hoc search and PSC limits retrieval quality. While fine-tuning on PSC data helps, its benefits remain constrained by this input gap. In this work, we propose Conv2Query, a novel conversation-to-query framework that adapts ad-hoc retrievers to PSC by bridging the input gap between ad-hoc search and PSC. Conv2Query maps conversational context into ad-hoc queries, which can either be used as input for off-the-shelf ad-hoc retrievers or for further fine-tuning on PSC data. Extensive experiments on two PSC datasets show that Conv2Query significantly improves ad-hoc retrievers' performance, both when used directly and after fine-tuning on PSC.
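
The plug-and-play arrangement described in the abstract can be sketched as a two-stage pipeline: a mapper turns the conversation into a short query, and an unchanged retriever consumes that query. The sketch below is only an illustration under loud assumptions. `toy_conv2query`, `overlap_retriever`, `proactive_search`, and the stopword list are hypothetical names invented here; the keyword heuristic is a stand-in for the learned conversation-to-query model, and the term-overlap ranker is a stand-in for a real ad-hoc retriever. None of it is the authors' implementation.

```python
import re
from typing import Callable, List, Optional

# Hypothetical stopword list, chosen only for this toy example.
STOPWORDS = {
    "a", "an", "the", "is", "are", "my", "no", "but", "since", "did",
    "you", "any", "after", "to", "at", "how", "really", "lately",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_conv2query(turns: List[str], max_terms: int = 4) -> str:
    """Stand-in for a learned mapper: keep distinct content words of the last turn."""
    terms: List[str] = []
    for tok in tokenize(turns[-1]):
        if tok not in STOPWORDS and tok not in terms:
            terms.append(tok)
    return " ".join(terms[:max_terms])


def overlap_retriever(query: str, docs: List[str], k: int = 1) -> List[int]:
    """Stand-in ad-hoc retriever: rank documents by number of shared terms."""
    q = set(tokenize(query))
    scores = [len(q & set(tokenize(doc))) for doc in docs]
    # Sort by descending score; ties are broken by document index.
    ranked = sorted(range(len(docs)), key=lambda i: (-scores[i], i))
    return ranked[:k]


def proactive_search(
    turns: List[str],
    docs: List[str],
    retriever: Callable[[str, List[str], int], List[int]],
    mapper: Optional[Callable[[List[str]], str]] = None,
    k: int = 1,
) -> List[int]:
    """Feed the retriever either the raw context or the mapped query."""
    text = mapper(turns) if mapper is not None else " ".join(turns)
    return retriever(text, docs, k)


turns = [
    "My laptop battery drains really fast lately.",
    "Did you change any settings after the update?",
    "No, but the fan is loud since the update.",
]
docs = [
    "How to bake sourdough bread at home",
    "Fixing a loud laptop fan after a system update",
    "Choosing a phone battery case",
]

print(toy_conv2query(turns))
print(proactive_search(turns, docs, overlap_retriever, mapper=toy_conv2query))
```

The retriever is passed in as a plain callable and never modified, which mirrors the claim that the generated queries can be used with off-the-shelf retrievers. Passing `mapper=None` reproduces the baseline of feeding the whole conversational context to the retriever.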
Problem

Research questions and friction points this paper is trying to address.

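Proactive search in conversations (PSC) aims to reduce the effort users spend formulating explicit queries
Ad-hoc retrievers are pre-trained on short, concise queries, so longer, noisier PSC input limits retrieval quality
Conv2Query aims to bridge the input gap between ad-hoc search and PSC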
Innovation

Methods, ideas, or system contributions that make the work stand out.

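Maps conversational context into ad-hoc queries
Bridges the input gap so ad-hoc retrievers can be used off the shelf or fine-tuned further on PSC data
Significantly improves ad-hoc retrievers' performance on two PSC datasets, both when used directly and after fine-tuning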