🤖 AI Summary
This paper addresses the detection and pixel-level localization of covert hateful content in multimodal memes, particularly in "Confounder Memes", by proposing an end-to-end contrastive learning framework. Methodologically, it integrates contrastive learning, generative data augmentation, and lightweight joint training. Key contributions include: (1) a novel contrastive meme generator that automatically constructs semantically paired positive and negative samples; (2) a customized triplet dataset explicitly designed for hate speech identification; and (3) an image-text alignment module that produces context-aware multimodal embeddings. Evaluated on the Hateful Meme Dataset, the model achieves a 12.3% improvement in F1-score for hate detection while using significantly fewer trainable parameters than mainstream large models. Moreover, it is the first to support fine-grained spatial localization of hateful regions, demonstrating that implicit hateful content can be identified in a feasible and interpretable way.
📝 Abstract
Amidst the rise of Large Multimodal Models (LMMs) and their widespread application in generating and interpreting complex content, the risk of propagating biased and harmful memes remains significant. Current safety measures often fail to detect subtly integrated hateful content within "Confounder Memes". To address this, we introduce HateSieve, a new framework designed to enhance the detection and segmentation of hateful elements in memes. HateSieve features a novel Contrastive Meme Generator that creates semantically paired memes, a customized triplet dataset for contrastive learning, and an Image-Text Alignment module that produces context-aware embeddings for accurate meme segmentation. Empirical experiments on the Hateful Meme Dataset show that HateSieve not only surpasses existing LMMs in performance with fewer trainable parameters but also offers a robust mechanism for precisely identifying and isolating hateful content. Caution: Contains academic discussions of hate speech; viewer discretion advised.
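To make the role of a triplet dataset in contrastive learning concrete, here is a minimal sketch of a standard triplet margin loss over embedding vectors. It is a generic illustration, not HateSieve's actual objective: the abstract does not specify the loss, so the cosine distance, the `margin` value, and the function names `cosine_distance` and `triplet_loss` are all assumptions made for this example.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings: the anchor matches the positive exactly
# and is orthogonal to the negative, so the triplet is already satisfied.
well_separated = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])

# Swapping positive and negative violates the margin and is penalized.
violated = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In a setup like the one the abstract describes, the anchor, positive, and negative would be multimodal embeddings of a meme and its semantically paired counterparts; how those triplets are actually assembled is defined by the paper, not by this sketch.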