🤖 AI Summary
Existing alignment techniques for large language models (LLMs) rarely address individualized value alignment, i.e., modeling a user's idiosyncratic, context-sensitive value orientations. Method: We propose ValueSim, a multi-module framework grounded in the Cognitive-Affective Personality System, which converts structured user data (e.g., demographics, life experiences) into narrative backstories that a model can reason over. The approach integrates narrative generation, structured-data encoding, multi-stage value reasoning, and cognitive-psychological mechanisms to build an end-to-end system for simulating individual values. Contribution/Results: Evaluated on a self-constructed benchmark derived from the World Values Survey, the method improves top-1 accuracy by over 10% compared with retrieval-augmented generation methods, and its performance continues to improve as interaction history accumulates. This work provides the first empirical validation of a computationally tractable, dynamically evolving model of individual values, demonstrating both theoretical validity and practical scalability.
📝 Abstract
As Large Language Models (LLMs) continue to exhibit increasingly human-like capabilities, aligning them with human values has become critically important. Contemporary techniques, such as prompt learning and reinforcement learning, are being deployed to better align LLMs with human values. However, while these approaches address broad ethical considerations and helpfulness, they rarely focus on simulating individualized human value systems. To address this gap, we present ValueSim, a framework that simulates individual values through the generation of personal backstories reflecting past experiences and demographic information. ValueSim converts structured individual data into narrative backstories and employs a multi-module architecture inspired by the Cognitive-Affective Personality System to simulate individual values based on these narratives. On a self-constructed benchmark derived from the World Values Survey, ValueSim improves top-1 accuracy by over 10% compared to retrieval-augmented generation methods. Further analysis shows that performance improves as additional user interaction history becomes available, indicating that the model can refine its persona simulation over time.
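As a rough illustration of two ideas in the abstract, the sketch below renders a structured record as a short narrative backstory and computes top-1 accuracy over survey answers. It is not the ValueSim implementation: the `Profile` fields, the `to_backstory` template, and the `top1_accuracy` helper are all hypothetical names introduced here, and a fixed string template stands in for whatever narrative generation the framework actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical structured record; the field names are illustrative only."""
    age: int
    country: str
    occupation: str
    experiences: list[str] = field(default_factory=list)


def to_backstory(profile: Profile) -> str:
    """Render a profile as a first-person narrative (template-based stand-in)."""
    parts = [
        f"I am a {profile.age}-year-old {profile.occupation} "
        f"living in {profile.country}."
    ]
    # Each past experience becomes one sentence of the narrative.
    parts += [f"In the past, I {event}." for event in profile.experiences]
    return " ".join(parts)


def top1_accuracy(predicted: list[int], actual: list[int]) -> float:
    """Fraction of questions where the single predicted option matches the answer."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    person = Profile(34, "Chile", "teacher", ["grew up in a rural town"])
    print(to_backstory(person))
    print(top1_accuracy([1, 2, 3, 4], [1, 2, 0, 4]))
```

Under these assumptions, the "over 10%" result corresponds to a difference in `top1_accuracy` between two systems answering the same survey items for the same individuals.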