Enhancing User Engagement in Socially-Driven Dialogue through Interactive LLM Alignments

📅 2025-06-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address low user engagement in social-driven dialogue systems, this paper proposes an interactive alignment method that leverages real-time user feedback as a reward signal. Methodologically, it introduces (1) the first “intent-response-as-reward” interactive alignment paradigm; (2) interactive Monte Carlo Tree Search (i×MCTS), which simulates dialogue evolution to generate high-quality preference data; and (3) an end-to-end framework integrating a user simulator, i×MCTS, Direct Preference Optimization (DPO), and interactive fine-tuning. Evaluated on empathetic support and benevolent persuasion tasks, the approach achieves substantial improvements over state-of-the-art baselines: +28.6% in session duration, +34.1% in response engagement, and +22.3% in dialogue completion rate—demonstrating comprehensive gains in user retention and interaction quality.

Technology Category

Application Category

📝 Abstract
Enhancing user engagement through interactions plays an essential role in socially-driven dialogues. While prior works have optimized models to reason over relevant knowledge or plan a dialogue act flow, the relationship between user engagement and knowledge or dialogue acts is subtle and does not guarantee user engagement in socially-driven dialogues. To this end, we enable interactive LLMs to learn user engagement by leveraging signals from the future development of conversations. Specifically, we adopt a more direct and relevant indicator of user engagement, i.e., the user's reaction related to dialogue intention after the interaction, as a reward to align interactive LLMs. To achieve this, we develop a user simulator to interact with target interactive LLMs and explore interactions between the user and the interactive LLM system via extit{i$ imes$MCTS} ( extit{M}onte extit{C}arlo extit{T}ree extit{S}earch for extit{i}nteraction). In this way, we collect a dataset containing pairs of higher and lower-quality experiences using extit{i$ imes$MCTS}, and align interactive LLMs for high-level user engagement by direct preference optimization (DPO) accordingly. Experiments conducted on two socially-driven dialogue scenarios (emotional support conversations and persuasion for good) demonstrate that our method effectively enhances user engagement in interactive LLMs.
Problem

Research questions and friction points this paper is trying to address.

Enhancing user engagement in socially-driven dialogues
Aligning LLMs using future conversation signals
Optimizing dialogue interactions via user reaction rewards
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interactive LLMs learn from future conversation signals
User simulator explores interactions via i×MCTS algorithm
Align LLMs using DPO with quality-ranked experience pairs
🔎 Similar Papers
No similar papers found.
J
Jiashuo Wang
Department of Computing, The Hong Kong Polytechnic University
Kaitao Song
Kaitao Song
Senior Researcher, Microsoft Research
Natural Language ProcessingLarge Language ModelsArtificial General Intelligence
Chunpu Xu
Chunpu Xu
PolyU
Multimodal learningNatural language processing
C
Changhe Song
Department of Computing, The Hong Kong Polytechnic University
Y
Yang Xiao
Department of Computing, The Hong Kong Polytechnic University
D
Dongsheng Li
Microsoft Research Asia
Lili Qiu
Lili Qiu
NAI Fellow, ACM Fellow, IEEE Fellow, Professor, Dept. of Computer Science, The University of Texas
Wireless NetworksWireless SensingMobile ComputingSystems5G
W
Wenjie Li
Department of Computing, The Hong Kong Polytechnic University