🤖 AI Summary
To address core challenges in conversational search, namely intent understanding across turns, topic drift, and semantic ambiguity, this paper proposes a lightweight, context-aware retrieval framework. Methodologically: (1) a context-aware embedding mechanism deeply integrates the dialogue history into query representation; (2) a query-rewriting-based intent supervision strategy explicitly models the evolution of user intent without adding inference overhead; and (3) intent-guided training is combined with context-enhanced embeddings to preserve the base LLM's generative capabilities while improving retrieval effectiveness. Extensive experiments on benchmark conversational search datasets, including MSDialog and TopiOCQA, demonstrate significant gains over state-of-the-art methods, confirming the framework's robustness and generalizability across diverse dialogue scenarios.
📝 Abstract
Effective conversational search demands a deep understanding of user intent across multiple dialogue turns. Users frequently use abbreviations and shift topics in the middle of conversations, posing challenges for conventional retrievers. While query rewriting techniques improve clarity, they often incur significant computational cost due to additional autoregressive steps. Moreover, although LLM-based retrievers demonstrate strong performance, they are not explicitly optimized to track user intent in multi-turn settings, often failing under topic drift or contextual ambiguity. To address these limitations, we propose ContextualRetriever, a novel LLM-based retriever that directly incorporates conversational context into the retrieval process. Our approach introduces: (1) a context-aware embedding mechanism that highlights the current query within the dialogue history; (2) intent-guided supervision based on high-quality rewritten queries; and (3) a training strategy that preserves the generative capabilities of the base LLM. Extensive evaluations across multiple conversational search benchmarks demonstrate that ContextualRetriever significantly outperforms existing methods while incurring no additional inference overhead.
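The abstract's first contribution, a context-aware embedding mechanism that "highlights the current query within the dialogue history", can be illustrated with a minimal input-construction sketch. This is an assumption-laden illustration, not the paper's actual implementation: the marker tokens (`[TURN]`, `[CUR_Q]`, `[/CUR_Q]`) and the function name are hypothetical, chosen only to show how prior turns and the current query might be flattened into a single encoder input.

```python
# Hypothetical sketch of context-aware input construction for a
# conversational retriever. Prior turns are concatenated, and the current
# query is wrapped in marker tokens so the encoder can distinguish it from
# the dialogue history. All token names here are illustrative assumptions.

def build_contextual_input(history, current_query,
                           turn_token="[TURN]",
                           cur_open="[CUR_Q]", cur_close="[/CUR_Q]"):
    """Flatten prior turns and highlight the current query with markers."""
    parts = [f"{turn_token} {utterance}" for utterance in history]
    parts.append(f"{cur_open} {current_query} {cur_close}")
    return " ".join(parts)

history = ["Who wrote Dune?", "Frank Herbert.", "When was it published?"]
text = build_contextual_input(history, "What about its sequel?")
print(text)
```

In a full system, `text` would be embedded (e.g., by pooling the LLM's final hidden states) and scored against passage embeddings; the paper's intent supervision would then train this embedding to match that of a standalone rewritten query, so no rewriting step is needed at inference time.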