🤖 AI Summary
This study addresses a limitation of existing language models and personality datasets: they predominantly focus on users' static traits while neglecting the influence of dynamic psychological states in interactive contexts. Grounded in state-trait theory, the authors construct and publicly release the Chameleon dataset, comprising 5,001 psychological profiles of 1,667 Reddit users measured across multiple situational contexts. Using psychometric methods and variance decomposition, they quantitatively demonstrate for the first time that 74% of behavioral variability stems from transient states rather than stable traits. Evaluations of mainstream large language models and reward models reveal a pervasive "state blindness" in the former, and inconsistent preferences in the latter: different reward models react to the same psychological state in opposite directions. Together, these results underscore the need to incorporate dynamic state modeling in conversational AI systems.
📝 Abstract
User interactions with language models vary due to both static properties of the user (traits) and the specific context of the interaction (states). However, existing persona datasets (e.g., PersonaChat, PANDORA) capture only traits and ignore the impact of states. We introduce Chameleon, a dataset of 5,001 contextual psychological profiles from 1,667 Reddit users, each measured across multiple contexts. Using the Chameleon dataset, we present three key findings. First, inspired by Latent State-Trait theory, we decompose variance and find that 74% is within-person (state) while only 26% is between-person (trait). Second, we find that LLMs are state-blind: they focus only on traits and produce similar responses regardless of state. Third, we find that reward models react to user state, but inconsistently: different models favor or penalize the same users in opposite directions. We release Chameleon to support research on affective computing, personalized dialogue, and RLHF alignment.
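The within-person vs. between-person split cited in the abstract comes from a Latent State-Trait style variance decomposition. A minimal sketch of that idea is a one-way random-effects decomposition: pool each user's variance across their own contexts (state) and compare it to the variance of user means (trait). The data and numbers below are synthetic illustrations, not the Chameleon statistics or the authors' actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each user's score = a stable per-user offset (trait)
# plus per-context fluctuation (state).
n_users, n_contexts = 200, 3
trait = rng.normal(0.0, 1.0, size=(n_users, 1))          # between-person part
state = rng.normal(0.0, 1.7, size=(n_users, n_contexts))  # within-person part
scores = trait + state                                    # observed profiles

# Within-person variance: pooled variance of each user's scores
# around that user's own mean (captures state fluctuation).
within_var = scores.var(axis=1, ddof=1).mean()

# Between-person variance: variance of user means, corrected for the
# sampling noise that state fluctuation adds to each mean.
user_means = scores.mean(axis=1)
between_var = max(user_means.var(ddof=1) - within_var / n_contexts, 0.0)

total = within_var + between_var
print(f"within-person (state):  {within_var / total:.0%}")
print(f"between-person (trait): {between_var / total:.0%}")
```

With these (arbitrary) simulation parameters the state share lands near the paper's headline figure, since the state standard deviation was chosen larger than the trait one; the point is only to show how such a percentage is computed, not to reproduce the paper's result.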