Affective Image Editing: Shaping Emotional Factors via Text Descriptions

📅 2025-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing text-driven image editing methods struggle to model and control the emotional attributes of an image because they lack both a semantic understanding and a continuous representation of users' affective intent. To address this, we propose AIEdiT, an affective image editing framework driven by text descriptions, with three components: (1) a continuous emotional spectrum that models the emotion space; (2) a learnable emotional mapper that translates visually abstract emotional requests into visually concrete semantic representations; and (3) MLLM-supervised training, paired at inference with strategic distortion of visual elements followed by shaping of the corresponding emotional factors. To support training and evaluation, we introduce a large-scale dataset of emotion-aligned text-image pairs. Extensive experiments show that AIEdiT achieves superior performance and effectively reflects users' emotional requests.

📝 Abstract

In daily life, images, as common affective stimuli, have widespread applications. Despite significant progress in text-driven image editing, there is limited work focusing on understanding users' emotional requests. In this paper, we introduce AIEdiT for Affective Image Editing using Text descriptions, which evokes specific emotions by adaptively shaping multiple emotional factors across the entire image. To represent universal emotional priors, we build a continuous emotional spectrum and extract nuanced emotional requests. To manipulate emotional factors, we design an emotional mapper that translates visually abstract emotional requests into visually concrete semantic representations. To ensure that editing results evoke specific emotions, we introduce an MLLM to supervise the model training. During inference, we strategically distort visual elements and subsequently shape the corresponding emotional factors to edit images according to users' instructions. Additionally, we introduce a large-scale dataset of emotion-aligned text-image pairs for training and evaluation. Extensive experiments demonstrate that AIEdiT achieves superior performance, effectively reflecting users' emotional requests.

Problem

Research questions and friction points this paper is trying to address.

Editing images to evoke specific emotions via text descriptions

Translating abstract emotional requests into visual representations

Creating emotion-aligned datasets for training and evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Builds a continuous emotional spectrum to capture nuanced emotional requests

Designs an emotional mapper that translates abstract requests into concrete semantic representations

Uses MLLM supervision during training so that edits evoke the intended emotion

🔎 Similar Papers

EmoEdit: Evoking Emotions through Image Manipulation