AI Summary
Existing proactive dialogue systems rely on predefined keywords and neglect implicit user attributes and preferences embedded in dialogue history, hindering the establishment of long-term interpersonal intimacy. This paper introduces Memory-Aware Proactive Dialogue (MapDia), a novel task that models users' implicit preferences from historical interactions and proactively steers conversation topics. Our contributions are threefold: (1) we formally define the MapDia task for the first time; (2) we construct ChMapData, the first Chinese memory-enhanced proactive dialogue dataset; and (3) we propose a unified Retrieval-Augmented Generation (RAG) framework integrating historical topic summarization, memory-augmented retrieval, and timing-aware topic transition. Extensive experiments demonstrate that our approach significantly outperforms baselines in both automatic and human evaluations. To foster reproducibility and advance research on empathetic, memory-grounded proactive dialogue, we publicly release both the code and dataset.
Abstract
Proactive dialogue systems aim to empower chatbots with the capability of leading conversations towards specific targets, thereby enhancing user engagement and service autonomy. Existing systems typically target pre-defined keywords or entities, neglecting user attributes and preferences implicit in dialogue history, hindering the development of long-term user intimacy. To address these challenges, we take a radical step towards building a more human-like conversational agent by integrating proactive dialogue systems with long-term memory into a unified framework. Specifically, we define a novel task named Memory-aware Proactive Dialogue (MapDia). By decomposing the task, we then propose an automatic data construction method and create the first Chinese Memory-aware Proactive Dataset (ChMapData). Furthermore, we introduce a joint framework based on Retrieval Augmented Generation (RAG), featuring three modules: Topic Summarization, Topic Retrieval, and Proactive Topic-shifting Detection and Generation, designed to steer dialogues towards relevant historical topics at the right time. The effectiveness of our dataset and models is validated through both automatic and human evaluations. We release the open-source framework and dataset at https://github.com/FrontierLabs/MapDia.
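To make the three-module flow concrete, the following is a minimal Python sketch of how topic summarization, topic retrieval, and topic-shift detection could be chained together. It is not the released MapDia implementation: every name here (`TopicMemory`, `summarize_topic`, `retrieve_topic`, `should_shift`, `respond`) is hypothetical, and the model-based modules described in the abstract are replaced by simple stand-ins (truncation for summarization, bag-of-words cosine similarity for retrieval, and a short-reply heuristic for timing).

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TopicMemory:
    """One stored summary of a past conversation topic (hypothetical type)."""
    summary: str


def tokenize(text: str) -> Counter:
    # Lowercased bag of alphanumeric tokens.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize_topic(turns: list, max_words: int = 12) -> TopicMemory:
    # Stand-in for Topic Summarization: keep the first few words.
    # Assumption: a real system would call a summarization model here.
    words = " ".join(turns).split()
    return TopicMemory(" ".join(words[:max_words]))


def retrieve_topic(context: list, memories: list):
    # Stand-in for Topic Retrieval: return the best-matching memory and its score.
    query = tokenize(" ".join(context))
    best: Optional[TopicMemory] = None
    best_score = 0.0
    for memory in memories:
        score = cosine(query, tokenize(memory.summary))
        if score > best_score:
            best, best_score = memory, score
    return best, best_score


def should_shift(context: list, score: float,
                 min_score: float = 0.1, short_reply_words: int = 3) -> bool:
    # Stand-in for topic-shift detection: shift only when a relevant memory
    # exists and the latest user turn is short (a crude sign the topic is fading).
    return score >= min_score and len(context[-1].split()) <= short_reply_words


def respond(context: list, memories: list) -> str:
    # Chain the three stages; the reply templates are placeholders for generation.
    memory, score = retrieve_topic(context, memories)
    if memory is not None and should_shift(context, score):
        return f"By the way, earlier you mentioned: {memory.summary}. How is that going?"
    return "Tell me more."


memories = [
    summarize_topic(["I adopted a puppy named Mochi last month"]),
    summarize_topic(["My marathon training starts in spring"]),
]
print(respond(["I walked past a pet store with a cute puppy today", "yeah"], memories))
```

The thresholds `min_score` and `short_reply_words` are arbitrary illustration values. The point of the sketch is the control flow: a retrieved memory is only surfaced when the timing check passes, otherwise the system stays on the current topic.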