🤖 AI Summary
This study addresses the challenge of modeling individual personality traits, measured with the Big Five framework, in persona large language models (LLMs) for text chat. We propose augmenting dialog data with think-aloud utterances (TAUs): verbalizations of a speaker's thought before the utterance is articulated, inserted into the original dialogs as explicit proxies for otherwise implicit cognitive processes. To our knowledge, this is the first systematic application of TAUs to personality modeling. Experiments show that persona LLMs trained with TAU-augmented data align more closely with the speakers' Agreeableness and Neuroticism than those trained on the original dialog data; moreover, the quality of the TAU augmentation affects modeling performance. The approach suggests a text-based route to personality-aware modeling, with implications for understanding how internal thought relates to linguistic expression and for conversational AI systems.
📝 Abstract
This study proposes augmenting dialog data with think-aloud utterances (TAUs) for modeling individual personalities in text chat with LLMs. A TAU is a verbalization of a speaker's thought before articulating the utterance. We expect that "persona LLMs" trained with TAU-augmented data can mimic the speaker's personality traits better. We tested whether the trained persona LLMs acquire the speaker's personality with respect to the Big Five, a framework that characterizes human personality traits along five dimensions. The results showed that LLMs trained with TAU-augmented data align more closely with the speakers' Agreeableness and Neuroticism than those trained with the original dialog data. We also found that the quality of TAU augmentation impacts the persona LLM's performance.
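As an illustration only, the augmentation described above can be sketched as inserting one TAU turn immediately before each utterance of the speaker being modeled. The abstract does not specify the data format or how TAUs are produced, so the `Turn` structure, the `is_tau` flag, the `augment_with_tau` function, and the `make_tau` callback below are all hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    speaker: str
    text: str
    is_tau: bool = False  # True for an inserted think-aloud utterance (assumed flag)


def augment_with_tau(
    dialog: List[Turn],
    target_speaker: str,
    make_tau: Callable[[List[Turn], Turn], str],
) -> List[Turn]:
    """Return a copy of `dialog` with a TAU turn placed before each
    utterance of `target_speaker`.

    `make_tau` is a hypothetical callback that receives the augmented
    history so far and the upcoming utterance, and returns the TAU text.
    How TAUs are actually obtained is not stated in the abstract.
    """
    augmented: List[Turn] = []
    for turn in dialog:
        if turn.speaker == target_speaker:
            tau_text = make_tau(list(augmented), turn)
            augmented.append(Turn(turn.speaker, tau_text, is_tau=True))
        augmented.append(turn)
    return augmented


def placeholder_tau(history: List[Turn], upcoming: Turn) -> str:
    # Stand-in for a real TAU source; it only marks where a TAU would go.
    return f"[thought before saying: {upcoming.text!r}]"
```

Calling `augment_with_tau(dialog, "A", placeholder_tau)` on a dialog between speakers A and B leaves B's turns untouched and doubles the number of A's turns, which gives the persona LLM for A both the thought and the utterance as training text.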