See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation

📅 2026-01-08
🏛️ arXiv.org
🤖 AI Summary
This work addresses the detection, explanation, and pre-publication mitigation of hateful memes in data-scarce settings. It proposes what the authors believe is the first unified few-shot multimodal agent framework for this purpose. The framework combines task-specific generative multimodal agents with few-shot-adapted large multimodal models to perform hate content detection, semantic interpretation, and pre-publication intervention with limited labeled data. By handling these three tasks within a single generalizable, low-resource-friendly architecture, the study aims to make content moderation systems more efficient, interpretable, and practical to deploy.

📝 Abstract
In this work, we examine hateful memes from three complementary angles: how to detect them, how to explain their content, and how to intervene before they are posted. We do so by applying a range of strategies built on top of generative AI models. To the best of our knowledge, explanation and intervention have typically been studied separately from detection, which does not reflect real-world conditions. Further, since curating large annotated datasets for meme moderation is prohibitively expensive, we propose a novel framework that leverages task-specific generative multimodal agents and the few-shot adaptability of large multimodal models to cater to different types of memes. We believe this is the first work focused on generalizable hateful meme moderation under limited data conditions, and that it has strong potential for deployment in real-world production scenarios. Warning: Contains potentially toxic content.
Problem

Research questions and friction points this paper is trying to address.

hateful meme moderation
few-shot learning
multimodal AI
content intervention
explainable AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

few-shot learning
multimodal agent
hateful meme moderation
generative AI
content intervention
Naquee Rizwan
Indian Institute of Technology (IIT), Kharagpur
Subhankar Swain
Indian Institute of Technology (IIT), Kharagpur
Paramananda Bhaskar
Indian Institute of Technology (IIT), Kharagpur
Gagan Aryan
Simbian
Shehryaar Shah Khan
Indian Institute of Technology (IIT), Kharagpur
Animesh Mukherjee
Professor of Computer Science, IIT Kharagpur, FNAE, Distinguished Member, ACM
Language dynamics, Complex systems and networks, Web social media