Personality Matters: User Traits Predict LLM Preferences in Multi-Turn Collaborative Tasks

📅 2025-08-29

🤖 AI Summary

Prior LLM evaluations largely ignore user heterogeneity, assuming uniform preferences across individuals. This study investigates whether personality traits systematically influence users’ preferences for large language models (LLMs) in multi-turn human-AI collaboration. Method: Using the Keirsey Temperament Sorter to classify participants, the authors ran multi-round interactive experiments across four task domains (data analysis, creative writing, information retrieval, and writing assistance), comparing GPT-4 and Claude 3.5. Preferences were quantified via helpfulness ratings and supplemented with qualitative feedback, which was examined through sentiment analysis. Contribution/Results: Although aggregate helpfulness ratings were similar across the two models, personality type predicted preference: users with the Rational temperament strongly preferred GPT-4, whereas Idealist users favored Claude 3.5. The results provide empirical evidence of temperament-based LLM preference patterns. The findings challenge the “one-size-fits-all” evaluation paradigm and motivate user-centered, human factors–informed approaches to personalized LLM adaptation and psychometrically grounded evaluation.

📝 Abstract

As Large Language Models (LLMs) increasingly integrate into everyday workflows, where users shape outcomes through multi-turn collaboration, a critical question emerges: do users with different personality traits systematically prefer certain LLMs over others? We conducted a study with 32 participants evenly distributed across four Keirsey personality types, evaluating their interactions with GPT-4 and Claude 3.5 across four collaborative tasks: data analysis, creative writing, information retrieval, and writing assistance. Results revealed significant personality-driven preferences: Rationals strongly preferred GPT-4, particularly for goal-oriented tasks, while Idealists favored Claude 3.5, especially for creative and analytical tasks. Other personality types showed task-dependent preferences. Sentiment analysis of qualitative feedback confirmed these patterns. Notably, aggregate helpfulness ratings were similar across models, showing how personality-based analysis reveals LLM differences that traditional evaluations miss.
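The last point of the abstract, that similar aggregate ratings can hide opposing subgroup preferences, can be illustrated with a minimal sketch. The ratings below are invented toy values on an assumed 1-5 scale, not data from the study, and the helper name `mean_by` is hypothetical; the four temperament names are the standard Keirsey categories.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data, NOT results from the paper:
# (temperament, model, helpfulness rating on an assumed 1-5 scale)
ratings = [
    ("Rational", "GPT-4", 5), ("Rational", "Claude 3.5", 3),
    ("Idealist", "GPT-4", 3), ("Idealist", "Claude 3.5", 5),
    ("Guardian", "GPT-4", 4), ("Guardian", "Claude 3.5", 4),
    ("Artisan", "GPT-4", 4), ("Artisan", "Claude 3.5", 4),
]


def mean_by(records, key):
    """Average the rating (third field) of records grouped by key(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record[2])
    return {group: mean(values) for group, values in groups.items()}


# Aggregate view: one mean rating per model, ignoring temperament.
overall = mean_by(ratings, key=lambda r: r[1])

# Stratified view: one mean rating per (temperament, model) pair.
by_temperament = mean_by(ratings, key=lambda r: (r[0], r[1]))

print("Overall:", overall)
for (temperament, model), score in sorted(by_temperament.items()):
    print(f"{temperament:9s} {model:11s} {score}")
```

In this toy data the two models tie in the aggregate view, while the stratified view shows Rationals and Idealists leaning in opposite directions. A real analysis would also need a significance test and per-task breakdowns, which this sketch omits.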

Problem

Research questions and friction points this paper is trying to address.

Whether users’ personality traits predict their LLM preferences in multi-turn collaboration

How preferences for GPT-4 versus Claude 3.5 differ across the four Keirsey temperaments

Personality-based analysis reveals differences traditional evaluations miss

Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzed personality-based LLM preferences using Keirsey types

Compared GPT-4 and Claude 3.5 across four collaborative tasks

Used sentiment analysis to confirm personality-driven preference patterns
