Assessment and manipulation of latent constructs in pre-trained language models using psychometric scales

📅 2024-09-29
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large conversational models can be prompted to answer psychometric questionnaires, but simpler transformers trained for other tasks cannot, so their latent psychological constructs have gone unassessed. This paper reformulates standard psychological questionnaires as natural language inference (NLI) prompts and provides a code library for the psychometric assessment of arbitrary models. On a sample of 88 publicly available models, the authors report human-like mental health-related constructs (including anxiety, depression, and Sense of Coherence) that conform with standard theories in human psychology and show similar correlations and mitigation strategies. They argue that interpreting and rectifying model behavior with psychological tools can support more explainable, controllable, and trustworthy models.

📝 Abstract
Human-like personality traits have recently been discovered in large language models, raising the hypothesis that their (known and as yet undiscovered) biases conform with human latent psychological constructs. While large conversational models may be tricked into answering psychometric questionnaires, the latent psychological constructs of thousands of simpler transformers, trained for other tasks, cannot be assessed because appropriate psychometric methods are currently lacking. Here, we show how standard psychological questionnaires can be reformulated into natural language inference prompts, and we provide a code library to support the psychometric assessment of arbitrary models. We demonstrate, using a sample of 88 publicly available models, the existence of human-like mental health-related constructs (including anxiety, depression, and Sense of Coherence) which conform with standard theories in human psychology and show similar correlations and mitigation strategies. The ability to interpret and rectify the performance of language models by using psychological tools can boost the development of more explainable, controllable, and trustworthy models.
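The reformulation of a questionnaire item into NLI prompts can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' library or its API: the item wording, the `LIKERT` answer options, the hypothesis template, and the expectation-based scoring rule are all hypothetical, and `toy_entail_prob` is a stand-in for the entailment probability that a real NLI model would return.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical Likert options and their numeric values (not taken from the paper).
LIKERT = {"never": 0, "sometimes": 1, "often": 2, "always": 3}

# An entailment scorer maps (premise, hypothesis) to an entailment probability.
EntailFn = Callable[[str, str], float]

# Made-up items in the style of a self-report scale: (premise, hypothesis template).
ITEMS: Sequence[Tuple[str, str]] = [
    ("I have been feeling nervous lately.", "This happens to me {}."),
    ("I find it hard to relax.", "This happens to me {}."),
]


def item_score(premise: str, template: str, entail_prob: EntailFn) -> float:
    """Expected Likert value, weighting each option by its normalized entailment probability."""
    probs = {opt: entail_prob(premise, template.format(opt)) for opt in LIKERT}
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("entailment probabilities must not all be zero")
    return sum(LIKERT[opt] * p for opt, p in probs.items()) / total


def scale_score(items: Sequence[Tuple[str, str]], entail_prob: EntailFn) -> float:
    """Mean item score over the whole (hypothetical) scale."""
    return sum(item_score(p, t, entail_prob) for p, t in items) / len(items)


def toy_entail_prob(premise: str, hypothesis: str) -> float:
    """Stand-in scorer (assumption): favors hypotheses containing 'often'."""
    return 0.7 if "often" in hypothesis else 0.1


if __name__ == "__main__":
    print(round(scale_score(ITEMS, toy_entail_prob), 3))
```

To assess an actual model, `toy_entail_prob` would be replaced by a function that queries an NLI classifier for the probability of its entailment label; the rest of the sketch is independent of which model supplies that number.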
Problem

Research questions and friction points this paper is trying to address.

Language Models
Psychological Traits
Model Transparency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Psychological Assessment
Language Models
Personalization and Mental States
Maor Reuben
Ben-Gurion University of the Negev
Ortal Slobodin
Ben-Gurion University of the Negev
Aviad Elyashar
Department of Computer Science, Shamoon College of Engineering (SCE)
Social Networks, Data Mining, Information Security, Machine Learning
Idan Cohen
Ben-Gurion University of the Negev
O. Braun-Lewensohn
Ben-Gurion University of the Negev
Odeya Cohen
Ben-Gurion University of the Negev
Rami Puzis
Software and Information Systems Engineering Department, Ben-Gurion University of the Negev
complex networks, social networks, deep learning, cyber security, cyberbiosecurity