EmoMind: Decoding Affective Captions from Human Brain fMRI

📅 2026-05-15

🤖 AI Summary
This study addresses a limitation of existing brain-to-text systems: they largely overlook affective content, and emotion-aware text generation typically relies on discrete emotion labels that fail to capture individual emotional nuance. To overcome this, the authors propose EmoMind, which they describe as the first end-to-end framework for decoding affective captions directly from fMRI signals. EmoMind first retrieves a neutral, semantically grounded scene description from brain-decoded visual features, then rewrites it using a 34-dimensional continuous emotion vector decoded from the same fMRI recording. The rewriter is trained with classifier-free guidance against an identity-preserving null branch, which enables smooth interpolation between semantic fidelity and affective expressivity. Evaluated on two independent emotion fMRI datasets, EmoMind significantly outperforms a GPT-4 baseline prompted with discrete emotion labels on subject-specificity, structural geometry, and causal control, with the largest gains on metrics that depend on person-specific affective structure.
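To illustrate why discrete labels can lose individual nuance, here is a minimal sketch of reducing a continuous 34-dimensional emotion vector to its top-5 labels, as the abstract describes for the GPT-4 baseline. The dimension names, the two subject vectors, and the helper `top_k_labels` are all hypothetical; the source does not list the emotion categories or give any decoded values.

```python
def top_k_labels(scores, names, k=5):
    """Return the names of the k highest-scoring emotion dimensions."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in ranked[:k]]


# Hypothetical dimension names; the real 34 categories are not given in the source.
names = [f"emotion_{i}" for i in range(34)]

# Two made-up subjects: same ranking of active dimensions, very different intensities.
subject_a = [0.0] * 34
subject_b = [0.0] * 34
for rank, dim in enumerate([3, 7, 11, 19, 28]):
    subject_a[dim] = 0.9 - 0.1 * rank    # strong, sharply graded response
    subject_b[dim] = 0.30 - 0.01 * rank  # weak, nearly flat response

labels_a = top_k_labels(subject_a, names)
labels_b = top_k_labels(subject_b, names)

# The label sets coincide even though the continuous vectors differ.
print(labels_a == labels_b)
print(subject_a == subject_b)
```

In this toy case both subjects yield the same five labels, so a label-prompted model would receive identical input for them, whereas the continuous vectors remain distinguishable.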
๐Ÿ“ Abstract
Decoding visual experience from brain activity has advanced substantially, but cur- rent brain-to-text systems largely recover semantic content while discarding affect. Additionally, language models can generate emotional text when prompted with categorical labels, but such labels collapse rich inter-subject variability into coarse discrete bins. We present EmoMind, the first end-to-end pipeline for decoding affective captions directly from fMRI signals. EmoMind first retrieves a semanti- cally grounded neutral scene description from brain-decoded visual features, then rewrites it using a continuous 34-dimensional emotion vector decoded from the same fMRI recording. To control the balance between content preservation and affective expression, we train the rewriter with classifier-free guidance against an identity-preserving null branch, enabling smooth interpolation between semantic fidelity and affective expressivity. We evaluate affective caption generation with a three-axis validation framework spanning subject-specificity, structural geometry, and causal control. We further augment this framework with a synthetic-brain substitution test that probes robustness to the measurement apparatus, and we benchmark each axis against GPT-4 prompted with brain-decoded top-5 emotion labels as a strong discrete baseline. Across two independent emotion fMRI datasets, EmoMind significantly outperforms label-prompted GPT-4 on all three axes, with the largest gains on metrics that require person-specific affective structure rather than population-level emotion aggregation. These results establish continuous brain-decoded affect as a viable control signal for individualized affective cap- tion generation and open new directions for studying individual affective brain organisation.
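The interpolation between semantic fidelity and affective expressivity can be sketched with the standard classifier-free guidance combination, in which a guidance scale blends the output of a null (unconditioned) branch with that of a conditioned branch. This is a toy illustration, not the paper's implementation: applying guidance to next-token logits, the three-word vocabulary, the logit values, and the function names are all assumptions made for the example.

```python
def guided_logits(null_logits, cond_logits, scale):
    """Standard classifier-free guidance: null + scale * (cond - null).

    scale = 0 reproduces the null branch, scale = 1 reproduces the
    conditioned branch, and scale > 1 extrapolates past it.
    """
    return [n + scale * (c - n) for n, c in zip(null_logits, cond_logits)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical next-token logits over a toy vocabulary.
vocab = ["a", "quiet", "eerie"]
null_logits = [1.0, 2.0, 0.5]  # null branch: favours the neutral word
cond_logits = [1.0, 1.5, 2.5]  # emotion-conditioned branch: favours the affective word

for scale in (0.0, 0.25, 1.0, 2.0):
    logits = guided_logits(null_logits, cond_logits, scale)
    print(scale, vocab[argmax(logits)])
```

With these made-up numbers, small scales keep the neutral word and larger scales switch to the affective one, which is the kind of controllable trade-off the abstract attributes to the guidance scale.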
Problem

Research questions and friction points this paper is trying to address.

affective captioning
brain decoding
fMRI
emotion representation
individual variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

affective captioning
fMRI decoding
continuous emotion representation
classifier-free guidance
individualized brain decoding