Measuring How LLMs Internalize Human Psychological Concepts: A preliminary analysis

📅 2025-06-28

📈 Citations: 0

✨ Influential: 0

career value

241K/year

🤖 AI Summary

This study addresses whether large language models (LLMs) accurately internalize human psychological constructs—a fundamental question in computational psychology and AI alignment. Method: We propose a quantitative psychological alignment framework grounded in 43 standardized psychometric instruments. Using pairwise semantic similarity computation, hierarchical clustering, and item-reconstruction classification tasks, we systematically evaluate the capacity of GPT-4, GPT-3.5, and BERT to reconstruct human psychological structure. Contribution/Results: GPT-4 achieves significantly higher fidelity in recovering human-derived dimensional clusters (66.2% classification accuracy), outperforming GPT-3.5 (55.9%) and BERT (48.1%). Its semantic representations exhibit significant positive correlation with human response patterns. Critically, the analysis uncovers systematic representational biases across specific constructs—e.g., overgeneralization of trait labels or under-differentiation of affective states. This work establishes the first interpretable, quantifiable paradigm for assessing psychological alignment in LLMs, advancing cognitive modeling and responsible deployment of foundation models in behavioral science applications.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) such as ChatGPT have shown remarkable abilities in producing human-like text. However, it is unclear how accurately these models internalize concepts that shape human thought and behavior. Here, we developed a quantitative framework to assess concept alignment between LLMs and human psychological dimensions using 43 standardized psychological questionnaires, selected for their established validity in measuring distinct psychological constructs. Our method evaluates how accurately language models reconstruct and classify questionnaire items through pairwise similarity analysis. We compared resulting cluster structures with the original categorical labels using hierarchical clustering. A GPT-4 model achieved superior classification accuracy (66.2%), significantly outperforming GPT-3.5 (55.9%) and BERT (48.1%), all exceeding random baseline performance (31.9%). We also demonstrated that the estimated semantic similarity from GPT-4 is associated with Pearson's correlation coefficients of human responses in multiple psychological questionnaires. This framework provides a novel approach to evaluate the alignment of the human-LLM concept and identify potential representational biases. Our findings demonstrate that modern LLMs can approximate human psychological constructs with measurable accuracy, offering insights for developing more interpretable AI systems.

Problem

Research questions and friction points this paper is trying to address.

Assess LLM alignment with human psychological concepts

Quantify LLM accuracy in reconstructing questionnaire items

Identify representational biases in LLM concept internalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantitative framework for concept alignment

Pairwise similarity analysis for classification

Hierarchical clustering compares human-LLM structures

🔎 Similar Papers

Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models