Learning from Mistakes: Enhancing Harmful Meme Detection via Misjudgment Risk Patterns

📅 2025-10-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing harmful meme detection methods struggle to identify rhetorically driven implicit harms, such as irony and metaphor, leading to high false-negative and false-positive rates. To address this, we propose PatMD, a novel framework that constructs a retrievable knowledge base of misjudgment risk patterns and couples pattern retrieval with dynamic reasoning to steer multimodal large language models (MLLMs) away from superficial content matching and toward structured, pattern-aware risk identification. On a benchmark of 6,626 memes spanning five harmful content detection tasks, PatMD improves average F1-score by 8.30% and accuracy by 7.71%, consistently outperforming state-of-the-art methods. The core contribution is the first formalization of misjudgment risk as explicit, retrievable, and reasoning-guidable knowledge, significantly enhancing MLLMs' robustness in detecting covert harmful semantics.

📝 Abstract
Internet memes have emerged as a popular multimodal medium, yet they are increasingly weaponized to convey harmful opinions through subtle rhetorical devices like irony and metaphor. Existing detection approaches, including MLLM-based techniques, struggle with these implicit expressions, leading to frequent misjudgments. This paper introduces PatMD, a novel approach that improves harmful meme detection by learning from and proactively mitigating these potential misjudgment risks. Our core idea is to move beyond superficial content-level matching and instead identify the underlying misjudgment risk patterns, proactively guiding the MLLMs to avoid known misjudgment pitfalls. We first construct a knowledge base where each meme is deconstructed into a misjudgment risk pattern explaining why it might be misjudged, either overlooking harmful undertones (false negative) or overinterpreting benign content (false positive). For a given target meme, PatMD retrieves relevant patterns and utilizes them to dynamically guide the MLLM's reasoning. Experiments on a benchmark of 6,626 memes across 5 harmful detection tasks show that PatMD outperforms state-of-the-art baselines, achieving an average of 8.30% improvement in F1-score and 7.71% improvement in accuracy, demonstrating strong generalizability and improved detection capability of harmful memes.
Problem

Research questions and friction points this paper is trying to address.

Detecting harmful memes using implicit rhetorical devices
Addressing misjudgment risks in multimodal meme detection
Improving detection accuracy for ironic and metaphorical memes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning from misjudgment risk patterns
Proactively guiding MLLMs to avoid pitfalls
Dynamic reasoning using retrieved risk patterns
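The retrieve-then-guide mechanism described above can be sketched in miniature. The bag-of-words similarity, the `PATTERN_KB` entries, and all function names below are illustrative assumptions, not the paper's implementation (PatMD builds its knowledge base from deconstructed memes and guides a real MLLM):

```python
# Hedged sketch of pattern retrieval + guided prompting, NOT the paper's code.
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words embedding; a real system would use a multimodal encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical knowledge base: each entry pairs a meme description with a
# misjudgment risk pattern explaining why similar memes get mislabeled.
PATTERN_KB = [
    {"desc": "cartoon praising a group with exaggerated compliments",
     "pattern": "Irony risk: effusive praise may mask mockery (false negative)."},
    {"desc": "dark humor joke about everyday frustration",
     "pattern": "Overinterpretation risk: edgy but benign humor (false positive)."},
    {"desc": "animal metaphor standing in for an ethnic group",
     "pattern": "Metaphor risk: harmless surface image hides a slur (false negative)."},
]

def retrieve_patterns(meme_text, k=2):
    # Rank knowledge-base entries by similarity to the target meme.
    scored = sorted(PATTERN_KB,
                    key=lambda e: cosine(embed(meme_text), embed(e["desc"])),
                    reverse=True)
    return [e["pattern"] for e in scored[:k]]

def build_guided_prompt(meme_text):
    # Inject retrieved risk patterns into the MLLM prompt so the model
    # reasons about known pitfalls instead of matching surface content.
    bullets = "\n".join(f"- {p}" for p in retrieve_patterns(meme_text))
    return (f"Meme text: {meme_text}\n"
            f"Known misjudgment risks for similar memes:\n{bullets}\n"
            "Considering these risks, classify the meme as HARMFUL or BENIGN "
            "and justify your decision.")
```

The key design point this sketch illustrates is that retrieval operates over *risk patterns* (why a model might err) rather than over raw meme content, so the prompt warns the model about false-negative and false-positive pitfalls before it classifies.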
Wenshuo Wang
Professor, Beijing Institute of Technology (BIT) | Research Fellow, UC Berkeley, CMU, McGill
Human-Robot Interaction, Autonomous Driving, Bayesian Learning, Human Factors
Ziyou Jiang
Institute of Software, Chinese Academy of Sciences
Software Engineering
Junjie Wang
State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; Science and Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Mingyang Li
State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; Science and Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Jie Huang
State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; Science and Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Yuekai Huang
State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; Science and Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Zhiyuan Chang
Institute of Software, Chinese Academy of Sciences
LLM Security, Multimodal Testing, Requirements Engineering
Feiyan Duan
State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; Science and Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Qing Wang
State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; Science and Technology on Integrated Information System Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China