Traits Run Deep: Enhancing Personality Assessment via Psychology-Guided LLM Representations and Multimodal Apparent Behaviors

📅 2025-07-30
🤖 AI Summary
Traditional personality assessment struggles to model stable traits expressed across modalities, asynchronously, and at subconscious levels. To address this, we propose a psychology-guided multimodal personality modeling framework. First, we design personality-specific prompts to steer large language models (LLMs) in extracting fine-grained personality semantics from textual inputs. Second, we introduce a text-centric feature fusion network comprising chunk-wise projection, cross-modal connectors, and a text-enhancement module to achieve semantic alignment and integration of linguistic, facial, and behavioral signals. Evaluated on the AVI validation set, our method reduces mean squared error (MSE) by 45% over prior approaches and secures first place in the AVI Challenge 2025 Personality Assessment Track. The framework significantly improves both accuracy and robustness in personality trait estimation, demonstrating strong generalization across heterogeneous, temporally misaligned, and implicitly encoded behavioral cues.

📝 Abstract
Accurate and reliable personality assessment plays a vital role in many fields, such as emotional intelligence, mental health diagnostics, and personalized education. Unlike fleeting emotions, personality traits are stable and often subconsciously leaked through language, facial expressions, and body behaviors, with asynchronous patterns across modalities. Traditional superficial features struggle to capture personality semantics, and effective cross-modal understanding has remained elusive. To address these challenges, we propose a novel personality assessment framework called Traits Run Deep. It employs psychology-informed prompts to elicit high-level personality-relevant semantic representations. In addition, it devises a Text-Centric Trait Fusion Network that anchors rich text semantics to align and integrate asynchronous signals from other modalities. Specifically, the fusion module includes a Chunk-Wise Projector to decrease dimensionality, a Cross-Modal Connector and a Text Feature Enhancer for effective modality fusion, and an ensemble regression head to improve generalization in data-scarce situations. To our knowledge, we are the first to apply personality-specific prompts to guide large language models (LLMs) in extracting personality-aware semantics for improved representation quality. Furthermore, extracting and fusing audio-visual apparent behavior features further improves accuracy. Experimental results on the AVI validation set demonstrate the effectiveness of the proposed components, yielding approximately a 45% reduction in mean squared error (MSE). Final evaluations on the test set of the AVI Challenge 2025 confirm our method's superiority, ranking first in the Personality Assessment track. The source code will be made available at https://github.com/MSA-LMC/TraitsRunDeep.
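The abstract names the pieces of the Text-Centric Trait Fusion Network (Chunk-Wise Projector, Cross-Modal Connector, Text Feature Enhancer, ensemble regression head) but this page gives no implementation details. The NumPy sketch below only illustrates one plausible shape of that pipeline; all dimensions, the random weights, the residual-style enhancement, and the class and method names are assumptions, not the authors' code.

```python
import numpy as np

class TextCentricFusionSketch:
    """Toy sketch (not the paper's implementation): chunk-wise projection
    of text features, a connector mapping audio-visual features into the
    text space, residual text enhancement, and an averaged ensemble of
    linear regression heads for the five trait scores."""

    def __init__(self, text_dim=64, av_dim=32, num_chunks=4,
                 chunk_out=8, num_heads=3, num_traits=5, seed=0):
        rng = np.random.default_rng(seed)
        assert text_dim % num_chunks == 0
        chunk_in = text_dim // num_chunks
        # Chunk-Wise Projector: one small matrix per chunk of the text feature,
        # reducing dimensionality chunk by chunk.
        self.proj = [rng.standard_normal((chunk_in, chunk_out)) * 0.1
                     for _ in range(num_chunks)]
        fused_dim = num_chunks * chunk_out
        # Cross-Modal Connector: maps audio-visual behavior features into
        # the (reduced) text feature space.
        self.connector = rng.standard_normal((av_dim, fused_dim)) * 0.1
        # Ensemble regression head: several linear heads, averaged to
        # improve generalization when data is scarce.
        self.heads = [rng.standard_normal((fused_dim, num_traits)) * 0.1
                      for _ in range(num_heads)]
        self.num_chunks = num_chunks

    def forward(self, text_feat, av_feat):
        chunks = np.split(text_feat, self.num_chunks)
        text_low = np.concatenate([c @ W for c, W in zip(chunks, self.proj)])
        av_in_text = av_feat @ self.connector
        # Text Feature Enhancer (assumed residual form): text stays the
        # anchor; behavior signals are added as an enhancement.
        enhanced = text_low + av_in_text
        preds = np.stack([enhanced @ H for H in self.heads])
        return preds.mean(axis=0)  # one score per trait

model = TextCentricFusionSketch()
scores = model.forward(np.ones(64), np.ones(32))
print(scores.shape)  # (5,)
```

The text-centric design shows up in the residual step: the reduced text representation is always present, and the audio-visual signal can only modulate it, which matches the paper's claim of anchoring fusion on text semantics.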
Problem

Research questions and friction points this paper is trying to address.

Enhancing personality assessment via psychology-guided LLM representations
Modeling asynchronous personality traits across multimodal behaviors
Improving cross-modal understanding for accurate trait fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Psychology-informed prompts for LLM representations
Text-Centric Trait Fusion Network integration
Cross-modal alignment with audio-visual features
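The page highlights psychology-informed, personality-specific prompts but does not reproduce their wording, so the template below is purely illustrative: the phrasing, the `build_trait_prompt` helper, and the per-trait loop are hypothetical stand-ins for how one trait-specific prompt per Big Five dimension might be constructed.

```python
# Hypothetical trait-specific prompt template -- the paper's actual
# prompt wording is not shown on this page.
BIG_FIVE = ["openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism"]

def build_trait_prompt(trait: str, transcript: str) -> str:
    # One prompt per trait, steering the LLM toward trait-relevant cues.
    return (
        f"You are a personality psychologist. Focusing on the Big Five "
        f"trait '{trait}', analyze the interview transcript below and "
        f"describe the linguistic cues relevant to that trait.\n\n"
        f"Transcript: {transcript}"
    )

prompts = [build_trait_prompt(t, "I enjoy trying new things ...")
           for t in BIG_FIVE]
print(len(prompts))  # 5
```

Each prompt's LLM response (or hidden-state representation) would then serve as the personality-aware text feature that the fusion network anchors on.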
Jia Li
Hefei University of Technology, Hefei, China
Yichao He
Hefei University of Technology, Hefei, China
Jiacheng Xu
Nanyang Technological University
Reinforcement Learning, Large Language Model
Tianhao Luo
Hefei University of Technology, Hefei, China
Zhenzhen Hu
Hefei University of Technology
Multimedia
Richang Hong
Hefei University of Technology
Multimedia, Pattern Recognition
Meng Wang
Hefei University of Technology, Hefei, China