MSM-BD: Multimodal Social Media Bot Detection Using Heterogeneous Information

๐Ÿ“… 2024-12-31
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address the covert misuse of highly realistic, generative-AI-driven social bots, this paper proposes a robust multimodal detection framework. Methodologically, it jointly models textual, visual, and user-statistical features, introducing the Cross-Modal Residual Cross-Attention (CMRCA) mechanismโ€”the first to enable fine-grained, calibratable feature alignment and complementary enhancement across heterogeneous modalities. The framework integrates dedicated multimodal encoders with a graph neural network to explicitly capture user relational structures. Evaluated on the TwiBot-22 benchmark, it achieves an F1-score of 92.7%, surpassing the state-of-the-art by 3.2 percentage points and demonstrating significantly improved detection capability against novel, evasive bot variants. The core contributions are the novel design of the CMRCA mechanism and its first successful application to social bot detection, establishing a new paradigm for robust, multimodal adversarial bot identification.

Technology Category

Application Category

๐Ÿ“ Abstract
Although social bots can be engineered for constructive applications, their potential for misuse in manipulative schemes and malware distribution cannot be overlooked. This dichotomy underscores the critical need to detect social bots on social media platforms. Advances in artificial intelligence have improved the abilities of social bots, allowing them to generate content that is almost indistinguishable from human-created content. These advancements require the development of more advanced detection techniques to accurately identify these automated entities. Given the heterogeneous information landscape on social media, spanning images, texts, and user statistical features, we propose MSM-BD, a Multimodal Social Media Bot Detection approach using heterogeneous information. MSM-BD incorporates specialized encoders for heterogeneous information and introduces a cross-modal fusion technology, Cross-Modal Residual Cross-Attention (CMRCA), to enhance detection accuracy. We validate the effectiveness of our model through extensive experiments using the TwiBot-22 dataset.
Problem

Research questions and friction points this paper is trying to address.

Social Media
Bot Detection
Disinformation Prevention
Innovation

Methods, ideas, or system contributions that make the work stand out.

MSM-BD
Multi-modal Information
CMRCA Technology
๐Ÿ”Ž Similar Papers
No similar papers found.
Tingxuan Wu
Tingxuan Wu
London School of Economics and Political Science
Computer Vision
Z
Zhaorui Ma
Department of Computer Science, George Mason University, USA
Yanjun Cui
Yanjun Cui
Dartmouth College
Z
Ziyi Zhou
Department of Computer Science, Dartmouth College, USA
E
Eric Wang
Independent Researcher, USA