🤖 AI Summary
This paper addresses the problem of ambiguous intrinsic value representation in large language models (LLMs) and the lack of grounding in empirical psychological theory. To this end, the authors propose the first empirically grounded psychological framework for LLM value modeling. Methodologically, they introduce the Generative Psycho-Lexical Approach (GPLA), a novel method that constructs a five-factor psychological value system tailored to LLMs by integrating psychological lexicon development, generative prompt engineering, multi-task benchmark evaluation, and quantitative value-consistency analysis. Compared with Schwartz's classical theory of values, the proposed system passes standard psychometric validity tests and shows significant improvements along three dimensions: accuracy of value representation, predictive power for safety risks, and efficacy in human–AI value alignment. The framework establishes a verifiable, interpretable, and scalable psychological foundation for value modeling in LLMs.
📝 Abstract
Values are core drivers of individual and collective perception, cognition, and behavior. Value systems, such as Schwartz's Theory of Basic Human Values, delineate the hierarchy and interplay among these values, enabling cross-disciplinary investigations into decision-making and societal dynamics. Recently, the rise of Large Language Models (LLMs) has raised concerns regarding their elusive intrinsic values. Despite growing efforts in evaluating, understanding, and aligning LLM values, a psychologically grounded LLM value system remains underexplored. This study addresses this gap by introducing the Generative Psycho-Lexical Approach (GPLA), a scalable, adaptable, and theoretically informed method for constructing value systems. Leveraging GPLA, we propose a psychologically grounded five-factor value system tailored for LLMs. For systematic validation, we present three benchmarking tasks that integrate psychological principles with cutting-edge AI priorities. Our results reveal that, compared with the canonical Schwartz values, the proposed value system meets standard psychological criteria, better captures LLM values, improves LLM safety prediction, and enhances LLM alignment.