Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas

📅 2026-04-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses intent entanglement and noise in user behavioral logs by proposing a hierarchical multi-persona induction framework. The approach first aggregates user actions into intent memories, then applies hierarchical clustering with natural-language labeling to produce multiple evidence-grounded personas. Persona induction is formulated as an optimization problem over persona quality, and the persona model is trained with a groupwise extension of Direct Preference Optimization (DPO) to improve cluster cohesion, persona-evidence alignment, and persona truthfulness. Experiments on a large-scale service log and two public datasets indicate that the induced personas are more coherent, evidence-grounded, and trustworthy than those of baseline methods, and that they also improve future interaction prediction.
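To make the cluster-then-label step concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the embeddings are made-up 2-D vectors, the average-linkage rule and distance threshold are assumptions, and `label_cluster` is a hypothetical keyword stand-in for the LLM-based natural-language labeling the summary describes.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def agglomerate(vectors, threshold):
    """Average-linkage agglomerative clustering (assumed linkage rule).

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def label_cluster(memories, members):
    """Hypothetical placeholder for LLM labeling: most frequent word."""
    words = Counter(w for i in members for w in memories[i].lower().split())
    return words.most_common(1)[0][0]


# Toy intent memories with invented embeddings (illustration only).
memories = ["hiking boots search", "trail map download",
            "pasta recipe view", "recipe video watch"]
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]

clusters = agglomerate(vectors, threshold=0.3)
personas = {label_cluster(memories, c): [memories[i] for i in c] for c in clusters}
```

Each entry in `personas` keeps the intent memories that support its label, which is the sense in which a persona is evidence-grounded in this sketch.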

📝 Abstract

Behavioral logs provide rich signals for user modeling, but are noisy and interleaved across diverse intents. Recent work uses LLMs to generate interpretable natural-language personas from user logs, yet evaluation often emphasizes downstream utility, providing limited assurance of persona quality itself. We propose a hierarchical framework that aggregates user actions into intent memories and induces multiple evidence-grounded personas by clustering and labeling these memories. We formulate persona induction as an optimization problem over persona quality (captured by cluster cohesion, persona-evidence alignment, and persona truthfulness) and train the persona model using a groupwise extension of Direct Preference Optimization (DPO). Experiments on a large-scale service log and two public datasets show that our method induces more coherent, evidence-grounded, and trustworthy personas, while also improving future interaction prediction.
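The abstract does not spell out the groupwise objective, so the sketch below only illustrates the general idea. `dpo_pair_loss` is the standard pairwise DPO loss; `group_dpo_loss` is a hypothetical extension that averages that loss over every preferred/dispreferred combination in a group, and should not be read as the authors' formulation. The function names, the `beta` default, and the tuple layout are all assumptions.

```python
import math
from itertools import product


def dpo_pair_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard pairwise DPO loss for one (preferred, dispreferred) pair.

    logp_*     : sequence log-probabilities under the policy being trained
    ref_logp_* : sequence log-probabilities under the frozen reference model
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)), written in a numerically stable form
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def group_dpo_loss(preferred, dispreferred, beta=0.1):
    """Hypothetical groupwise variant (assumption, not from the paper).

    Each item is a (policy_logp, reference_logp) tuple. The loss is the
    mean pairwise DPO loss over all preferred x dispreferred combinations.
    """
    losses = [
        dpo_pair_loss(pw, pl, rw, rl, beta)
        for (pw, rw), (pl, rl) in product(preferred, dispreferred)
    ]
    return sum(losses) / len(losses)


# Toy log-probabilities (invented): two preferred personas, one dispreferred.
preferred = [(-1.0, -2.0), (-1.5, -2.0)]
dispreferred = [(-3.0, -2.0)]
loss = group_dpo_loss(preferred, dispreferred)
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the pairwise loss equals log 2; the loss falls below that value as the policy shifts probability toward the preferred outputs relative to the reference.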

Problem

Research questions and friction points this paper is trying to address.

user behavioral logs

persona induction

evidence-grounded personas

truthful personas

multi-persona modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical persona induction

evidence-grounded personas

intent memory clustering