MuMTAffect: A Multimodal Multitask Affective Framework for Personality and Emotion Recognition from Physiological Signals

📅 2025-09-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of jointly modeling emotion classification and personality re-identification from short-duration physiological signals. We propose a multimodal multitask affective computing framework grounded in the Theory of Constructed Emotion, which decouples core affective representations from higher-level cognitive interpretations. A cross-modal Transformer architecture integrates pupillary response, eye movement, facial action units, and electrodermal activity (EDA); personality prediction is incorporated as an auxiliary task to promote user-specific affective embedding learning. Each modality employs a dedicated Transformer encoder, and cross-modal interactions are modeled via a fusion layer. The framework jointly optimizes valence/arousal classification and personality re-identification. Evaluation on the AFFEC dataset demonstrates that EDA significantly improves arousal recognition, while pupillary and oculomotor features enhance valence discrimination. The model exhibits strong cross-subject generalization, and its modular design enables flexible extension.

📝 Abstract
We present MuMTAffect, a novel Multimodal Multitask Affective Embedding Network designed for joint emotion classification and personality prediction (re-identification) from short physiological signal segments. MuMTAffect integrates multiple physiological modalities (pupil dilation, eye gaze, facial action units, and galvanic skin response) using dedicated transformer-based encoders for each modality and a fusion transformer to model cross-modal interactions. Inspired by the Theory of Constructed Emotion, the architecture explicitly separates core affect encoding (valence/arousal) from higher-level conceptualization, thereby grounding predictions in contemporary affective neuroscience. Personality trait prediction is leveraged as an auxiliary task to generate robust, user-specific affective embeddings, significantly enhancing emotion recognition performance. We evaluate MuMTAffect on the AFFEC dataset, demonstrating that stimulus-level emotional cues (Stim Emo) and galvanic skin response substantially improve arousal classification, while pupil and gaze data enhance valence discrimination. The inherent modularity of MuMTAffect allows effortless integration of additional modalities, ensuring scalability and adaptability. Extensive experiments and ablation studies underscore the efficacy of our multimodal multitask approach in creating personalized, context-aware affective computing systems, highlighting pathways for further advancements in cross-subject generalisation.
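To make the described architecture concrete, the following is a minimal PyTorch sketch of the general pattern: one dedicated Transformer encoder per modality, a fusion Transformer over the resulting modality tokens, and separate heads for valence/arousal classification and the auxiliary personality (re-identification) task. The names (ModalityEncoder, MuMTAffectSketch), dimensions, pooling choices, and head sizes are all illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a MuMTAffect-style network in PyTorch. Layer widths,
# pooling, and head sizes are assumptions for illustration only.
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Dedicated Transformer encoder for one physiological modality."""

    def __init__(self, in_dim: int, d_model: int = 64, n_layers: int = 2):
        super().__init__()
        self.proj = nn.Linear(in_dim, d_model)  # raw features -> model width
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, in_dim) -> (batch, d_model), mean-pooled over time
        return self.encoder(self.proj(x)).mean(dim=1)


class MuMTAffectSketch(nn.Module):
    """Per-modality encoders, a fusion Transformer, and three task heads."""

    def __init__(self, modality_dims: dict, d_model: int = 64, n_subjects: int = 10):
        super().__init__()
        self.encoders = nn.ModuleDict(
            {name: ModalityEncoder(dim, d_model) for name, dim in modality_dims.items()}
        )
        fusion_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(fusion_layer, num_layers=1)
        self.valence_head = nn.Linear(d_model, 2)   # low/high valence
        self.arousal_head = nn.Linear(d_model, 2)   # low/high arousal
        self.identity_head = nn.Linear(d_model, n_subjects)  # auxiliary re-identification

    def forward(self, inputs: dict):
        # One token per modality; the fusion Transformer models cross-modal
        # interactions before pooling into a shared affective embedding.
        tokens = torch.stack([self.encoders[m](x) for m, x in inputs.items()], dim=1)
        embedding = self.fusion(tokens).mean(dim=1)
        return (self.valence_head(embedding),
                self.arousal_head(embedding),
                self.identity_head(embedding))


# Example with the four modalities named in the abstract (feature dims assumed):
model = MuMTAffectSketch({"pupil": 2, "gaze": 4, "action_units": 17, "gsr": 1})
```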
Problem

Research questions and friction points this paper is trying to address.

Joint emotion classification and personality prediction from physiological signals
Integrating multimodal physiological data using transformer-based encoders
Enhancing emotion recognition through user-specific affective embeddings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal transformer encoders for physiological signals
Separates core affect encoding from conceptualization
Uses personality prediction as an auxiliary task (see the loss sketch after this list)
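In its simplest reading, the auxiliary task amounts to a weighted multitask objective: the shared embedding must support emotion classification while also identifying the user, which pushes it toward user-specific structure. A hedged sketch follows, assuming cross-entropy for all three outputs and an illustrative weight lambda_aux; the paper's actual loss terms and weighting are not specified here.

```python
import torch.nn.functional as F


def multitask_loss(valence_logits, arousal_logits, identity_logits,
                   valence_y, arousal_y, subject_y, lambda_aux: float = 0.3):
    """Joint objective: valence/arousal classification plus an auxiliary
    personality re-identification term. lambda_aux = 0.3 is an assumed value."""
    emotion_loss = (F.cross_entropy(valence_logits, valence_y)
                    + F.cross_entropy(arousal_logits, arousal_y))
    auxiliary_loss = F.cross_entropy(identity_logits, subject_y)
    return emotion_loss + lambda_aux * auxiliary_loss
```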
Authors

Meisam Jamshidi Seikavandi
PhD fellow at IT University of Copenhagen
computer vision, deep learning, eye tracking, human-computer interaction

Fabricio Batista Narcizo
GN Advanced Science, IT University of Copenhagen

Ted Vucurevich
GN Advanced Science

Andrew Burke Dittberner
GN Advanced Science

Paolo Burelli
Associate Professor
Artificial Intelligence, Data Mining, Computer Games