AI Summary
Existing open-domain dialogue systems exhibit limited capability in proactively identifying user preferences and steering conversations, leading to perceived neglect and reduced user satisfaction. To address this, we propose a user-centric proactive dialogue framework featuring: (1) a novel critic-guided proactivity enhancement paradigm, where an LLM-as-a-judge dynamically evaluates and guides response generation; (2) ISCO-800, the first dataset explicitly designed for user background modeling; and (3) an iterative curriculum learning strategy grounded in communication difficulty, integrating user simulation with multi-source preference modeling. Experiments demonstrate substantial improvements across multiple large language models in proactivity, topic steerability, and conversational engagement, with strong generalization to diverse open-domain scenarios.
Abstract
Open-domain dialogue systems aim to generate natural and engaging conversations, providing significant practical value in real applications such as social robotics and personal assistants. The advent of large language models (LLMs) has greatly advanced this field by improving context understanding and conversational fluency. However, existing LLM-based dialogue systems often fall short in proactively understanding the user's chatting preferences and guiding conversations toward user-centered topics. This lack of user-oriented proactivity can leave users feeling unappreciated, reducing their satisfaction and their willingness to continue the conversation. To address this issue, we propose a User-oriented Proactive Chatbot (UPC) that enhances user-oriented proactivity. Specifically, inspired by the LLM-as-a-judge strategy, we first construct a critic to evaluate this proactivity. Given the scarcity of high-quality training data, we then employ the critic to guide dialogues between the chatbot and user agents, generating a corpus with enhanced user-oriented proactivity. To ensure the diversity of user backgrounds, we introduce ISCO-800, a diverse user background dataset for constructing user agents. Moreover, because communication difficulty varies among users, we propose an iterative curriculum learning method that trains the chatbot first on easy-to-communicate users and then on more challenging ones, thereby gradually enhancing its performance. Experiments demonstrate that our proposed training method is applicable to different LLMs, improving user-oriented proactivity and attractiveness in open-domain dialogues.
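
The abstract does not say how communication difficulty is scored or how the critic steers generation, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes each user agent carries a numeric difficulty score and that the critic returns a numeric proactivity score for a candidate response. The names `UserAgent`, `curriculum_stages`, and `select_best` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class UserAgent:
    """Hypothetical simulated user; `difficulty` is an assumed score (higher = harder)."""
    name: str
    difficulty: float


def curriculum_stages(users: Sequence[UserAgent], n_stages: int) -> List[List[UserAgent]]:
    """Order users from easy to hard and split them into contiguous training stages."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    ordered = sorted(users, key=lambda u: u.difficulty)
    size, remainder = divmod(len(ordered), n_stages)
    stages: List[List[UserAgent]] = []
    start = 0
    for i in range(n_stages):
        # Earlier stages absorb the remainder so stage sizes differ by at most one.
        end = start + size + (1 if i < remainder else 0)
        stages.append(ordered[start:end])
        start = end
    return stages


def select_best(candidates: Sequence[str], critic: Callable[[str], float]) -> str:
    """Critic-guided selection: keep the candidate response the critic scores highest."""
    return max(candidates, key=critic)
```

In an iterative setup of this kind, the chatbot would be trained on dialogues collected from the first stage, then the next, with `select_best` (or a richer critic feedback loop) filtering which responses enter the training corpus at each round.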