MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection

📅 2025-07-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Detecting harmful content in rapidly evolving social media memes remains challenging because annotated data is scarce. To address this, we propose a zero-shot cross-modal detection framework that requires no labeled examples. Our method introduces three key innovations: (1) cross-modal retrieval that uses vision-language models to identify semantically similar memes in an unlabeled reference set; (2) a bidirectional insight generation mechanism that jointly models image-text inconsistency and contextually implied malice; and (3) a multi-agent debate-and-negotiation module that fuses heterogeneous model inferences to make decisions more reliable. Evaluated on three mainstream meme datasets, our approach significantly outperforms existing zero-shot baselines, with an average F1-score improvement of 12.6%, while demonstrating strong generalization capability and lightweight deployability.

📝 Abstract

The rapid expansion of memes on social media has highlighted the urgent need for effective approaches to detect harmful content. However, traditional data-driven approaches struggle to detect new memes due to their evolving nature and the lack of up-to-date annotated data. To address this issue, we propose MIND, a multi-agent framework for zero-shot harmful meme detection that does not rely on annotated data. MIND implements three key strategies: 1) We retrieve similar memes from an unannotated reference set to provide contextual information. 2) We propose a bi-directional insight derivation mechanism to extract a comprehensive understanding of similar memes. 3) We then employ a multi-agent debate mechanism to ensure robust decision-making through reasoned arbitration. Extensive experiments on three meme datasets demonstrate that our proposed framework not only outperforms existing zero-shot approaches but also shows strong generalization across different model architectures and parameter scales, providing a scalable solution for harmful meme detection. The code is available at https://github.com/destroy-lonely/MIND.
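The retrieval step in strategy 1 can be illustrated with a minimal sketch. It assumes each meme has already been encoded into a fixed-length embedding (for example by a vision-language model) and that similarity is measured with cosine similarity. The abstract does not state the encoder, the similarity measure, or the number of neighbours, so the function names `cosine` and `retrieve_similar` and the parameter `k` are hypothetical, not taken from the MIND code.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(
    query: Sequence[float],
    reference: Sequence[Sequence[float]],
    k: int = 3,
) -> List[int]:
    """Return indices of the k reference embeddings most similar to the query.

    Assumption: the reference set is unannotated, so only embeddings are used,
    never labels. Ties are broken by the lower index.
    """
    scored = [(cosine(query, emb), idx) for idx, emb in enumerate(reference)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]
```

The returned indices would then select the similar memes whose content serves as context for the later steps. A real system would likely replace the linear scan with an approximate nearest-neighbour index once the reference set grows large.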

Problem

Research questions and friction points this paper is trying to address.

Detect harmful memes without annotated data

Address evolving meme content on social media

Improve zero-shot detection and generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieves similar memes for contextual information

Uses bi-directional insight derivation mechanism

Employs multi-agent debate for decision-making
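The debate-and-arbitration idea can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the authors' implementation: each agent is modelled as a plain callable standing in for a model call, the derived insights are passed in as strings, and the judge is consulted only when the debaters disagree. The abstract specifies none of these details, and all names (`Opinion`, `debate`, `debaters`, `judge`) are hypothetical.

```python
from typing import Callable, List, Tuple

# (label, rationale); label is "harmful" or "harmless" in this sketch.
Opinion = Tuple[str, str]
# A debater reads the meme description and the derived insights.
Debater = Callable[[str, List[str]], Opinion]
# A judge additionally sees every debater's opinion and returns a final label.
Judge = Callable[[str, List[str], List[Opinion]], str]


def debate(
    meme: str,
    insights: List[str],
    debaters: List[Debater],
    judge: Judge,
) -> str:
    """Collect one opinion per debater, then arbitrate if they disagree.

    Assumption: unanimous debaters are trusted directly; otherwise the judge
    weighs the competing rationales and issues the final decision.
    """
    if not debaters:
        raise ValueError("at least one debater is required")
    opinions = [agent(meme, insights) for agent in debaters]
    labels = {label for label, _ in opinions}
    if len(labels) == 1:
        return labels.pop()
    return judge(meme, insights, opinions)
```

In practice each callable would wrap a prompt to a multimodal model. Keeping the agents as plain functions makes it straightforward to swap model architectures or parameter scales without touching the control flow.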

🔎 Similar Papers

Exploring the Limits of Zero Shot Vision Language Models for Hate Meme Detection: The Vulnerabilities and their Interpretations