🤖 AI Summary
Existing approaches to image-based emotion understanding are often limited to coarse-grained perception or lack fine-grained reasoning capabilities, making it difficult to capture the multidimensional nature and intensity variations of emotions. To address this gap, this work introduces EEmoDB, the largest-scale dataset for image emotion understanding to date, which uniquely integrates five key dimensions of emotion analysis. Furthermore, the authors propose EEmo-Logic, a multimodal large language model trained in multiple stages that leverages instruction tuning and a task-customized Group Relative Preference Optimization (GRPO) strategy to significantly enhance fine-grained emotion assessment and question-answering performance. Experimental results demonstrate that EEmo-Logic consistently outperforms existing methods under both in-domain and cross-domain evaluation settings.
📝 Abstract
Understanding the multi-dimensional attributes and intensity nuances of image-evoked emotions is pivotal for advancing machine empathy and empowering diverse human-computer interaction applications. However, existing models are still limited to coarse-grained emotion perception or hampered by deficient reasoning capabilities. To bridge this gap, we introduce EEmoDB, the largest image-evoked emotion understanding dataset to date. It features $5$ analysis dimensions spanning $5$ distinct task categories, facilitating comprehensive interpretation. Specifically, we compile $1.2M$ question-answering (QA) pairs (EEmoDB-QA) from $125k$ images via automated generation, alongside a $36k$-sample dataset (EEmoDB-Assess) curated from $25k$ images for fine-grained assessment. Furthermore, we propose EEmo-Logic, an all-in-one multimodal large language model (MLLM) developed via instruction fine-tuning and task-customized group relative preference optimization (GRPO) with a novel reward design. Extensive experiments demonstrate that EEmo-Logic achieves robust performance on both in-domain and cross-domain datasets, excelling in emotion QA and fine-grained assessment. The code is available at https://anonymous.4open.science/r/EEmoLogic.