Impact of Stickers on Multimodal Chat Sentiment Analysis and Intent Recognition: A New Task, Dataset and Baseline

📅 2024-05-14
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing sentiment analysis and intent recognition research largely overlooks stickers—a critical multimodal modality—lacking both a dedicated task formulation and high-quality Chinese multimodal datasets. Method: We propose the Sticker-Augmented Multimodal Chat Sentiment and Intent Recognition task (MSAIRS) and introduce the first open-source Chinese dialogue-sticker dataset featuring fine-grained causal contrastive design (i.e., same text/different stickers; same sticker/different texts). We formally characterize sticker influence mechanisms on semantic understanding and empirically demonstrate the pivotal role of sticker visual content in sentiment/intent discrimination. To this end, we design MMSAIR, a multimodal fusion model integrating cross-modal alignment, vision-language joint encoding, and contrastive learning. Results: Incorporating sticker visual features improves F1 by 12.3% over unimodal baselines, establishing new state-of-the-art performance. The dataset, code, and models are publicly released.

Technology Category

Application Category

📝 Abstract
Stickers are increasingly used in social media to express sentiment and intent. When finding typing troublesome, people often use a sticker instead. Despite the significant impact of stickers on sentiment analysis and intent recognition, little research has been conducted. To address this gap, we propose a new task: Multimodal chat Sentiment Analysis and Intent Recognition involving Stickers (MSAIRS). Additionally, we introduce a novel multimodal dataset containing Chinese chat records and stickers excerpted from several mainstream social media platforms. Our dataset includes paired data with the same text but different stickers, and various stickers consisting of the same images with different texts, allowing us to better understand the impact of stickers on chat sentiment and intent. We also propose an effective multimodal joint model, MMSAIR, for our task, which is validated on our datasets and indicates that visual information of stickers counts. Our dataset and code will be publicly available.
Problem

Research questions and friction points this paper is trying to address.

Analyze sentiment and intent using stickers in social media
Create a multimodal dataset for sticker impact analysis
Develop a model for joint sentiment and intent recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces MSAIRS task for sticker sentiment and intent
Proposes MMSAIR model with differential vector construction
Uses cascaded attention for enhanced multimodal fusion
🔎 Similar Papers
Y
Yuanchen Shi
School of Computer Science and Technology, Soochow University
B
Biao Ma
School of Computer Science and Technology, Soochow University
Fang Kong
Fang Kong
Southern University of Science and Technology, Assistant Professor
multi-armed banditsonline learningreinforcement learning