Know You Before You Speak: User-State Modeling for LLM Personalization in Multi-Turn Conversation

2026-05-23
AI Summary
Existing personalized dialogue systems struggle to model the dynamic evolution of users' latent states, often relying on static profiles or explicit memory and lacking proactive decision-making capabilities for future interactions. This work proposes PUMA, a novel framework that introduces the free energy principle to dialogue personalization by formulating the task as a partially observable sequential decision-making process. PUMA represents user states through latent variables and integrates action-conditioned state transitions with Bayesian belief updating, guiding dialogue policy by minimizing expected free energy. This approach shifts the paradigm from passive response retrieval to active, state-evolution-driven decision-making, unifying cognitive exploration with task-oriented objectives. Experimental results demonstrate that PUMA significantly improves long-term dialogue performance on healthcare consultation and motivational interviewing datasets, achieving superior response quality, user state estimation, and next-state prediction.
Abstract
Personalized dialogue requires more than recalling explicit user histories: systems also need to infer hidden user states that evolve through interaction and shape appropriate response strategies. Existing memory- and profile-based methods primarily reuse observable user information, offering limited support for modeling user-state dynamics or selecting actions based on how they shape future user states. We propose PUMA (Prospective User-state Modeling for Action selection), a framework grounded in the Free Energy Principle (FEP) that formulates personalization as decision-making under partial observability, centered on an explicit user state model that captures latent user states and their action-conditioned dynamics. At each turn, PUMA maintains a belief over the user's hidden state, refines the user state model for observation generation and action-conditioned state transition, and selects dialogue actions by minimizing expected free energy, balancing epistemic and pragmatic objectives under a unified criterion. This formulation shifts personalization from passive memory retrieval to model-based decision-making over user evolution. We instantiate PUMA on healthcare-oriented counseling and motivational interviewing benchmarks with latent state annotations for rigorous evaluation. Experiments show that PUMA improves long-horizon dialogue outcomes while maintaining strong response quality, and a cross-dataset study demonstrates more reliable user-state estimation and next-state prediction.
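The belief update and expected-free-energy action selection described in the abstract can be illustrated with a minimal discrete sketch. This is not PUMA's implementation: the two user states, the two dialogue actions, and every probability in `A`, `B`, and `C` are hypothetical values chosen only for illustration, and the expected free energy uses the standard active-inference decomposition into an epistemic term (expected information gain about the hidden state) and a pragmatic term (expected log-preference over observations), looking one step ahead.

```python
import math

# Hypothetical toy model (not taken from the paper).
# Hidden user states: index 0 = "resistant", index 1 = "ready".
# Observations:       index 0 = "sustain talk", index 1 = "change talk".
ACTIONS = ["ask_open_question", "give_advice"]

# A[s][o]: assumed observation model P(o | s).
A = [[0.8, 0.2],
     [0.3, 0.7]]

# B[a][s][s2]: assumed action-conditioned transition model P(s2 | s, a).
B = {
    "ask_open_question": [[0.6, 0.4], [0.1, 0.9]],
    "give_advice":       [[0.9, 0.1], [0.3, 0.7]],
}

# C[o]: assumed preference distribution over observations (pragmatic prior).
C = [0.1, 0.9]


def normalize(p):
    total = sum(p)
    return [x / total for x in p]


def update_belief(belief, obs):
    """Bayesian belief update: posterior is proportional to likelihood times prior."""
    return normalize([A[s][obs] * belief[s] for s in range(len(belief))])


def predict_state(belief, action):
    """Push the current belief through the action-conditioned transition."""
    T = B[action]
    n = len(belief)
    return [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]


def kl(p, q):
    """KL divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def expected_free_energy(belief, action):
    """One-step expected free energy: -(epistemic value) - (pragmatic value)."""
    prior = predict_state(belief, action)
    pred_obs = [sum(prior[s] * A[s][o] for s in range(len(prior)))
                for o in range(len(C))]
    # Epistemic value: expected information gain about the hidden state.
    info_gain = sum(pred_obs[o] * kl(update_belief(prior, o), prior)
                    for o in range(len(C)) if pred_obs[o] > 0)
    # Pragmatic value: expected log-preference of the predicted observations.
    pragmatic = sum(pred_obs[o] * math.log(C[o]) for o in range(len(C)))
    return -info_gain - pragmatic


def select_action(belief):
    """Pick the dialogue action with the lowest expected free energy."""
    return min(ACTIONS, key=lambda a: expected_free_energy(belief, a))


# Usage: start from a uniform belief, observe "sustain talk", then choose an action.
belief = update_belief([0.5, 0.5], 0)
chosen = select_action(belief)
```

In this toy setting a single score trades off learning about the user against steering toward preferred observations, which is the sense in which the abstract describes one unified criterion. A full system would replace the hand-written tables with a learned user state model and plan over longer horizons.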
Problem

Research questions and friction points this paper is trying to address.

personalized dialogue
user-state modeling
multi-turn conversation
latent user states
dialogue personalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

user-state modeling
personalized dialogue
Free Energy Principle
action selection
latent state dynamics
Jiani Luo
School of Computing, National University of Singapore, Singapore
Xiaoyan Zhao
School of Computing, National University of Singapore, Singapore
Yang Zhang
National University of Singapore
Recommendation, LLM Personalization, Trustworthy
Shuyi Miao
School of Artificial Intelligence, Beihang University, Beijing, China
Bingbing Xu
Associate professor, Institute of Computing Technology, Chinese Academy of Sciences
Graph Neural Networks, Network Embedding
Stefan Konigorski
German Institute of Human Nutrition Potsdam-Rehbruecke, Germany
Tat-Seng Chua
National University of Singapore
Multimedia Information Retrieval, Live Social Media Analysis