🤖 AI Summary
To address the emotion recognition and expression challenges faced by neurodiverse individuals, particularly those with autism spectrum disorder (ASD), this study proposes an interpretable, personalized, real-time multimodal affect estimation framework. The method integrates physiological signals (EEG, ECG, BVP, GSR/EDA) with behavioral modalities (facial expressions, speech) and models affect dynamically in the two-dimensional arousal–valence space. Key contributions include: (1) a cross-modal unified representation that supports both naturalistic (passive video viewing) and interactive (semi-structured dialogue) scenarios; and (2) an individual adaptation module coupled with neuroadaptive feedback, enabling affective education and inclusive human–machine interaction. Two demonstration scenarios illustrate real-time, user-specific affect tracking in both controlled and naturalistic settings. This work outlines a paradigm for affective computing tailored to neurodiverse users, supporting both theoretical understanding and practical deployment in assistive and educational technologies.
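To make the fusion architecture concrete, the sketch below shows one plausible reading of the summary: per-modality encoders project heterogeneous features into a shared latent space, a lightweight per-user affine layer stands in for the individual adaptation module, and a regression head emits continuous arousal–valence estimates. All class names, dimensions, and the mean-pooling fusion are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the cross-modal unified representation described
# above. Module names and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Projects one modality's feature vector into a shared latent space."""

    def __init__(self, in_dim: int, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class AffectEstimator(nn.Module):
    """Fuses modality latents and regresses a 2D (arousal, valence) point."""

    def __init__(self, modality_dims: dict[str, int], latent_dim: int = 64):
        super().__init__()
        self.encoders = nn.ModuleDict(
            {name: ModalityEncoder(dim, latent_dim)
             for name, dim in modality_dims.items()}
        )
        # Per-user affine adaptation: a simple stand-in for the individual
        # adaptation module mentioned in the summary.
        self.user_scale = nn.Parameter(torch.ones(latent_dim))
        self.user_shift = nn.Parameter(torch.zeros(latent_dim))
        self.head = nn.Linear(latent_dim, 2)  # (arousal, valence)

    def forward(self, features: dict[str, torch.Tensor]) -> torch.Tensor:
        # Mean-pool over whichever modalities are present, so the model
        # degrades gracefully when a sensor drops out.
        latents = [self.encoders[m](x) for m, x in features.items()]
        fused = torch.stack(latents).mean(dim=0)
        fused = fused * self.user_scale + self.user_shift
        return torch.tanh(self.head(fused))  # bounded to [-1, 1]^2


# Example: one time step of EEG/ECG/face/speech features (batch size 1).
dims = {"eeg": 32, "ecg": 8, "face": 17, "speech": 13}
model = AffectEstimator(dims)
sample = {m: torch.randn(1, d) for m, d in dims.items()}
print(model(sample))  # tensor of shape (1, 2): (arousal, valence)
```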
📝 Abstract
Many individuals, especially those with autism spectrum disorder (ASD), alexithymia, or other neurodivergent profiles, face challenges in recognizing, expressing, or interpreting emotions. To support more inclusive and personalized emotion technologies, we present a real-time multimodal emotion estimation system that combines neurophysiological signals, namely EEG, ECG, blood volume pulse (BVP), and galvanic skin response (GSR/EDA), with behavioral modalities (facial expressions and speech) in a unified two-dimensional arousal–valence interface that tracks moment-to-moment emotional states. This architecture enables interpretable, user-specific analysis and supports applications in emotion education, neuroadaptive feedback, and interaction support for neurodiverse users. Two demonstration scenarios illustrate its use: (1) passive media viewing (2D or VR videos) reveals cortical and autonomic responses to affective content, and (2) semi-scripted conversations with a facilitator or virtual agent capture real-time facial and vocal expressions. Together these tasks cover both controlled and naturalistic emotion monitoring, making the system well suited for personalized feedback and neurodiversity-informed interaction design.
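As a rough illustration of the moment-to-moment tracking the abstract describes for the passive-viewing scenario, the sketch below windows synchronized sensor streams, maps each window to an arousal–valence point, and smooths the on-screen readout. The functions `read_window` and `estimate_affect`, along with the window length, refresh rate, and smoothing factor, are hypothetical placeholders for the system's acquisition and estimation pipeline (the fusion model sketched above could serve as the estimator).

```python
# Hedged sketch of real-time arousal-valence tracking during passive viewing.
# All names and parameters are illustrative assumptions, not the system's API.
import time

import numpy as np

WINDOW_SEC = 2.0  # analysis window length (assumed)
UPDATE_HZ = 4     # interface refresh rate (assumed)
ALPHA = 0.3       # exponential smoothing factor for the 2D marker


def read_window(duration_sec: float) -> dict[str, np.ndarray]:
    """Placeholder: return the last `duration_sec` of each synchronized
    stream (EEG, ECG, BVP, GSR, face landmarks, speech). Stubbed with noise."""
    return {m: np.random.randn(64)
            for m in ("eeg", "ecg", "bvp", "gsr", "face", "speech")}


def estimate_affect(features: dict[str, np.ndarray]) -> np.ndarray:
    """Placeholder regression to an (arousal, valence) point in [-1, 1]^2."""
    pooled = np.mean([f.mean() for f in features.values()])
    return np.tanh(np.array([pooled, -pooled]))


smoothed = np.zeros(2)
for _ in range(8):  # in deployment: loop for the duration of the stimulus
    features = read_window(WINDOW_SEC)
    av = estimate_affect(features)
    smoothed = ALPHA * av + (1 - ALPHA) * smoothed  # EMA keeps the marker stable
    print(f"arousal={smoothed[0]:+.2f}  valence={smoothed[1]:+.2f}")
    time.sleep(1.0 / UPDATE_HZ)
```

The same loop structure would apply to the conversational scenario, with the facial and vocal streams weighted more heavily in the placeholder estimator.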