MER-Bench: A Comprehensive Benchmark for Multimodal Meme Reappraisal

📅 2026-03-16
🤖 AI Summary
This work proposes the "Meme Reappraisal" task, which aims to controllably transform the negative sentiment of internet memes into positive, constructive multimodal content while preserving their visual structure and semantic context. To facilitate research in this direction, we introduce MER-Bench, the first multimodal benchmark dataset specifically designed for emotion transformation with structural preservation, featuring fine-grained annotations for both affective states and compositional elements. We further present a generation and evaluation framework grounded in multimodal large language models (MLLMs). By adopting an MLLM-as-a-Judge paradigm for multidimensional automatic assessment, our experiments reveal significant limitations in current approaches regarding structural fidelity, semantic consistency, and affective transformation efficacy, thereby laying the groundwork for controllable meme editing and emotion-aware multimodal generation.

📝 Abstract
Memes represent a tightly coupled, multimodal form of social expression, in which visual context and overlaid text jointly convey nuanced affect and commentary. Inspired by cognitive reappraisal in psychology, we introduce Meme Reappraisal, a novel multimodal generation task that aims to transform negatively framed memes into constructive ones while preserving their underlying scenario, entities, and structural layout. Unlike prior works on meme understanding or generation, Meme Reappraisal requires emotion-controllable, structure-preserving multimodal transformation under multiple semantic and stylistic constraints. To support this task, we construct MER-Bench, a benchmark of real-world memes with fine-grained multimodal annotations, including source and target emotions, positively rewritten meme text, visual editing specifications, and taxonomy labels covering visual type, sentiment polarity, and layout structure. We further propose a structured evaluation framework based on a multimodal large language model (MLLM)-as-a-Judge paradigm, decomposing performance into modality-level generation quality, affect controllability, structural fidelity, and global affective alignment. Extensive experiments across representative image-editing and multimodal-generation systems reveal substantial gaps in satisfying the constraints of structural preservation, semantic consistency, and affective transformation. We believe MER-Bench establishes a foundation for research on controllable meme editing and emotion-aware multimodal generation. Our code is available at: https://github.com/one-seven17/MER-Bench.
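As a rough illustration of the annotation fields and the four evaluation dimensions named in the abstract, the following sketch models one benchmark record and a simple score aggregation. All class, field, and function names are hypothetical, and the unweighted mean is an assumption made for illustration; the actual MER-Bench schema and scoring procedure may differ.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Mapping

# Hypothetical record layout; field names are illustrative only and are
# not taken from the MER-Bench release.
@dataclass
class MemeAnnotation:
    meme_id: str
    source_emotion: str       # emotion conveyed by the original meme
    target_emotion: str       # emotion the reappraised meme should convey
    rewritten_text: str       # positively rewritten meme text
    visual_edit_spec: str     # instruction describing the visual edit
    visual_type: str          # taxonomy label: visual type
    sentiment_polarity: str   # taxonomy label: sentiment polarity
    layout_structure: str     # taxonomy label: layout structure

# The four dimensions listed in the abstract, under hypothetical key names.
JUDGE_DIMENSIONS = (
    "modality_quality",
    "affect_controllability",
    "structural_fidelity",
    "global_affective_alignment",
)

def aggregate_judge_scores(scores: Mapping[str, float]) -> float:
    """Combine per-dimension judge scores into one number.

    Assumption: an unweighted mean. The paper may weight or report
    the dimensions differently.
    """
    missing = [d for d in JUDGE_DIMENSIONS if d not in scores]
    if missing:
        raise KeyError(f"missing judge dimensions: {missing}")
    return mean(scores[d] for d in JUDGE_DIMENSIONS)
```

Keeping the dimensions in a single tuple means a judge response that omits one of them fails loudly instead of silently skewing the average.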
Problem

Research questions and friction points this paper is trying to address.

Meme Reappraisal
Multimodal Generation
Emotion Transformation
Structural Preservation
Affective Alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Meme Reappraisal
Multimodal Generation
Emotion-controllable Editing
Structure-preserving Transformation
MER-Bench
Yiqi Nie
School of Artificial Intelligence, Anhui University, Hefei, 230039, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230026, China
Fei Wang
Hefei University of Technology
Motion Magnification, MLLM, Affective Computing
Junjie Chen
Harbin Institute of Technology, Shenzhen
Bioinformatics, Natural Language Processing, Artificial Intelligence
Kun Li
College of Information Technology, United Arab Emirates University, Al Ain, Abu Dhabi, United Arab Emirates
Yudi Cai
Institute of Advanced Technology, University of Science and Technology of China, Hefei, 230031, China
Dan Guo
IEEE senior member, Professor, Hefei University of Technology
Multimedia Computing, Artificial Intelligence
Chenglong Li
Professor, The University of Florida
Drug Design, Drug Discovery, Molecular Recognition, Molecular Modeling, Protein structure and Dynamics
Meng Wang
School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, 230601, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230026, China