🤖 AI Summary
Existing sentiment analysis and intent recognition research largely overlooks stickers—a critical multimodal modality—lacking both a dedicated task formulation and high-quality Chinese multimodal datasets. Method: We propose the Sticker-Augmented Multimodal Chat Sentiment and Intent Recognition task (MSAIRS) and introduce the first open-source Chinese dialogue-sticker dataset featuring fine-grained causal contrastive design (i.e., same text/different stickers; same sticker/different texts). We formally characterize sticker influence mechanisms on semantic understanding and empirically demonstrate the pivotal role of sticker visual content in sentiment/intent discrimination. To this end, we design MMSAIR, a multimodal fusion model integrating cross-modal alignment, vision-language joint encoding, and contrastive learning. Results: Incorporating sticker visual features improves F1 by 12.3% over unimodal baselines, establishing new state-of-the-art performance. The dataset, code, and models are publicly released.
📝 Abstract
Stickers are increasingly used in social media to express sentiment and intent. When finding typing troublesome, people often use a sticker instead. Despite the significant impact of stickers on sentiment analysis and intent recognition, little research has been conducted. To address this gap, we propose a new task: Multimodal chat Sentiment Analysis and Intent Recognition involving Stickers (MSAIRS). Additionally, we introduce a novel multimodal dataset containing Chinese chat records and stickers excerpted from several mainstream social media platforms. Our dataset includes paired data with the same text but different stickers, and various stickers consisting of the same images with different texts, allowing us to better understand the impact of stickers on chat sentiment and intent. We also propose an effective multimodal joint model, MMSAIR, for our task, which is validated on our datasets and indicates that visual information of stickers counts. Our dataset and code will be publicly available.