Personality Matters: User Traits Predict LLM Preferences in Multi-Turn Collaborative Tasks

📅 2025-08-29
🤖 AI Summary
Prior LLM evaluations largely ignore user heterogeneity, assuming uniform preferences across individuals. This study investigates whether personality traits systematically influence users’ preferences for large language models (LLMs) in multi-turn human-AI collaboration. Method: Grounded in the Keirsey Temperament Sorter, the study ran multi-round interactive experiments across four task domains (data analysis, creative writing, information retrieval, and writing assistance), comparing GPT-4 and Claude 3.5. Preferences were quantified via helpfulness ratings and enriched with qualitative feedback analyzed through sentiment-aware thematic coding. Contribution/Results: Although overall helpfulness scores showed no statistically significant difference between models, personality type predicted preference: Rational users significantly favored GPT-4, whereas Idealist users preferred Claude 3.5. The study reports temperament-based LLM preference patterns that aggregate scores do not show. The findings challenge the “one-size-fits-all” evaluation paradigm and motivate personalized LLM adaptation and psychometrically grounded evaluation frameworks.

📝 Abstract
As Large Language Models (LLMs) increasingly integrate into everyday workflows, where users shape outcomes through multi-turn collaboration, a critical question emerges: do users with different personality traits systematically prefer certain LLMs over others? We conducted a study with 32 participants evenly distributed across four Keirsey personality types, evaluating their interactions with GPT-4 and Claude 3.5 across four collaborative tasks: data analysis, creative writing, information retrieval, and writing assistance. Results revealed significant personality-driven preferences: Rationals strongly preferred GPT-4, particularly for goal-oriented tasks, while Idealists favored Claude 3.5, especially for creative and analytical tasks. Other personality types showed task-dependent preferences. Sentiment analysis of qualitative feedback confirmed these patterns. Notably, aggregate helpfulness ratings were similar across models, showing how personality-based analysis reveals LLM differences that traditional evaluations miss.
Problem

Research questions and friction points this paper is trying to address.

Whether personality traits predict LLM preferences in multi-turn collaboration
How users with different Keirsey temperaments rate GPT-4 versus Claude 3.5
Personality-based analysis reveals differences traditional evaluations miss
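
The last point can be illustrated with a minimal sketch. The ratings below are synthetic numbers invented purely for illustration (they are not the study's data), and the helper names `mean_by_model` and `mean_by_group` are hypothetical. The sketch only shows how two models can have the same aggregate rating while individual temperament groups prefer opposite models.

```python
from statistics import mean

# Synthetic, made-up helpfulness ratings on a 1-5 scale.
# These are NOT the study's data; they exist only to show the masking effect.
# Each record: (temperament, model, rating)
ratings = [
    ("Rational", "GPT-4", 5), ("Rational", "GPT-4", 4),
    ("Rational", "Claude 3.5", 3), ("Rational", "Claude 3.5", 2),
    ("Idealist", "GPT-4", 3), ("Idealist", "GPT-4", 2),
    ("Idealist", "Claude 3.5", 5), ("Idealist", "Claude 3.5", 4),
]

def mean_by_model(records):
    """Average rating per model over the given records."""
    by_model = {}
    for _, model, score in records:
        by_model.setdefault(model, []).append(score)
    return {model: mean(scores) for model, scores in by_model.items()}

def mean_by_group(records):
    """Average rating per model, computed separately for each temperament."""
    groups = {}
    for record in records:
        groups.setdefault(record[0], []).append(record)
    return {group: mean_by_model(recs) for group, recs in groups.items()}

overall = mean_by_model(ratings)    # aggregate view: one mean per model
per_group = mean_by_group(ratings)  # stratified view: one mean per model per temperament

print("Overall:", overall)
for group, scores in per_group.items():
    print(group, scores)
```

With these invented numbers the aggregate means of the two models are identical, while the stratified view shows each group favoring a different model. A real analysis would also need a significance test suited to the study design, which this sketch omits.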
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzed personality-based LLM preferences using Keirsey types
Compared GPT-4 and Claude 3.5 across four collaborative tasks
Used sentiment analysis to confirm personality-driven preference patterns
Sarfaroz Yunusov
Brock University, St. Catharines, Canada
Kaige Chen
Brock University, St. Catharines, Canada
Kazi Nishat Anwar
Brock University, St. Catharines, Canada
Ali Emami
Assistant Professor, Emory University
Natural Language Processing, Artificial Intelligence, Computational Social Science, AI & Ethics