🤖 AI Summary
Goal-oriented open-ended dialogue systems struggle to simultaneously achieve user personalization, phase adaptability, and low-data learning. Method: This paper proposes a novel framework integrating large language models (LLMs) with a hierarchical reinforcement learning (HRL)-based dialogue manager. Specifically: (i) HRL models multi-phase dialogue policies to enable smooth, goal-driven transitions across phases; (ii) a meta-learning mechanism enables rapid personalization across diverse user profiles; and (iii) an LLM–HRL co-architecture decouples semantic generation from policy decision-making, reducing reliance on annotated dialogue data. Results: Evaluated on motivational interviewing tasks, the proposed dialogue manager achieves significantly higher reward scores than state-of-the-art LLM-based baselines, demonstrating superior performance in goal completion rate, user adaptability, and data efficiency.
📝 Abstract
In this work, we propose a novel framework that integrates large language models (LLMs) with an RL-based dialogue manager for open-ended dialogue with a specific goal. By leveraging hierarchical reinforcement learning to model the structured phases of dialogue and employing meta-learning to adapt across diverse user profiles, our approach improves adaptability and efficiency, enabling the system to learn from limited data, transition fluidly between dialogue phases, and personalize responses to heterogeneous patient needs. We apply our framework to Motivational Interviewing, aiming to foster behavior change, and demonstrate that the proposed dialogue manager outperforms a state-of-the-art LLM baseline in terms of reward, showing a potential benefit of conditioning LLMs to create open-ended dialogue systems with specific goals.
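The decoupling described above — an HRL dialogue manager deciding *what* to do (phase and dialogue act) while an LLM decides *how* to say it — can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the phase names follow the standard Motivational Interviewing stages, the policies are toy stand-ins for learned HRL policies, and `llm_generate` is a stub for a prompt-conditioned LLM call.

```python
import random

# Illustrative MI phases and per-phase dialogue acts (assumed, not from the paper).
PHASES = ["engaging", "focusing", "evoking", "planning"]
ACTS = {
    "engaging": ["open_question", "affirmation"],
    "focusing": ["agenda_setting", "reflection"],
    "evoking": ["elicit_change_talk", "reflection"],
    "planning": ["summarize", "commitment_question"],
}

class HighLevelPolicy:
    """Top-level HRL policy: selects the current phase (the 'option')."""
    def select_phase(self, turn_index):
        # Toy heuristic standing in for a learned policy:
        # advance through phases as the dialogue progresses.
        return PHASES[min(turn_index // 2, len(PHASES) - 1)]

class LowLevelPolicy:
    """Sub-policy: selects a dialogue act within the current phase."""
    def select_act(self, phase, rng):
        return rng.choice(ACTS[phase])

def llm_generate(phase, act, user_utterance):
    # Stub for the LLM surface realizer; a real system would prompt
    # an LLM conditioned on the chosen (phase, act) pair.
    return f"[{phase}/{act}] I hear you saying: {user_utterance!r}"

def dialogue_turn(turn_index, user_utterance, hi, lo, rng):
    """One turn: the manager decides, the LLM verbalizes."""
    phase = hi.select_phase(turn_index)
    act = lo.select_act(phase, rng)
    return phase, act, llm_generate(phase, act, user_utterance)
```

Because the policies emit only symbolic (phase, act) decisions, they can be trained from reward signals with far less annotated dialogue data than end-to-end generation would need, which is the data-efficiency argument the abstract makes.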