HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes

📅 2024-08-11

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To detect covert hateful content in multimodal memes, particularly “Confounder Memes”, and to localize it at the pixel level, this paper proposes HateSieve, an end-to-end contrastive learning framework. Methodologically, it integrates contrastive learning, generative data augmentation, and lightweight joint training. Key contributions include: (1) a novel Contrastive Meme Generator that automatically constructs semantically paired positive and negative samples; (2) a customized triplet dataset designed for contrastive learning on hateful content; and (3) an Image-Text Alignment module that produces context-aware multimodal embeddings. Evaluated on the Hateful Meme Dataset, the model achieves a 12.3% improvement in F1-score for hate detection while using fewer trainable parameters than existing Large Multimodal Models (LMMs). Moreover, it is the first to support fine-grained spatial localization of hateful regions, demonstrating that implicit hateful content can be identified in an interpretable way.

📝 Abstract

Amidst the rise of Large Multimodal Models (LMMs) and their widespread application in generating and interpreting complex content, the risk of propagating biased and harmful memes remains significant. Current safety measures often fail to detect subtly integrated hateful content within "Confounder Memes". To address this, we introduce HateSieve, a new framework designed to enhance the detection and segmentation of hateful elements in memes. HateSieve features a novel Contrastive Meme Generator that creates semantically paired memes, a customized triplet dataset for contrastive learning, and an Image-Text Alignment module that produces context-aware embeddings for accurate meme segmentation. Empirical experiments on the Hateful Meme Dataset show that HateSieve not only surpasses existing LMMs in performance with fewer trainable parameters but also offers a robust mechanism for precisely identifying and isolating hateful content. Caution: Contains academic discussions of hate speech; viewer discretion advised.

Problem

Research questions and friction points this paper is trying to address.

Detect subtly integrated hateful content in memes

Segment hateful elements in multimodal meme content

Improve accuracy in identifying harmful meme components

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive Meme Generator for semantic meme pairs

Custom triplet dataset for contrastive learning

Image-Text Alignment module for context-aware embeddings
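
To illustrate how a triplet dataset can drive contrastive learning, the following is a minimal sketch of a standard triplet margin loss over embedding vectors. It is an assumption for illustration only: this page does not state the exact loss HateSieve uses, and the function names, the Euclidean distance, and the margin value are all hypothetical choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings standing in for meme embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss clamps to 0.
loss_easy = triplet_margin_loss(anchor, positive, negative)

# Swapped roles: 5 - 1 + 1 = 5, so the triplet is penalized.
loss_hard = triplet_margin_loss(anchor, negative, positive)
```

In this sketch, a triplet contributes to training only while the negative sits too close to the anchor relative to the positive, which is the general idea behind pairing each meme with a semantically similar and a semantically contrasting counterpart.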

🔎 Similar Papers

Exploring the Limits of Zero Shot Vision Language Models for Hate Meme Detection: The Vulnerabilities and their Interpretations