Identifying social bots via heterogeneous motifs based on Naïve Bayes model

📅 2025-12-27
🤖 AI Summary
Problem: Existing social bot detection methods model the heterogeneity of neighborhood preferences inadequately and lack a theoretical foundation for motif-based discrimination. Method: This paper proposes a theoretical framework for heterogeneous motifs based on the Naïve Bayes model. The framework (i) refines motif structures with node labels to capture heterogeneous neighbor preferences explicitly; (ii) theoretically quantifies, for the first time, the maximum discriminative capacity of heterogeneous motifs; and (iii) uses this capacity to drive a Top-K motif selection mechanism that sharply reduces computational overhead without sacrificing performance. The approach combines heterogeneous network motif mining, probabilistic modeling, and pairwise node contribution analysis. Results: On four public datasets, the method outperforms state-of-the-art approaches across five metrics. Notably, using only about 10% of the highest-capacity motifs achieves over 98% of the detection performance of the full motif set.

📝 Abstract
Identifying social bots has become a critical challenge due to their significant influence on social media ecosystems. Despite advancements in detection methods, most topology-based approaches insufficiently account for the heterogeneity of neighborhood preferences and lack a systematic theoretical foundation, relying instead on intuition and experience. Here, we propose a theoretical framework for detecting social bots utilizing heterogeneous motifs based on the Naïve Bayes model. Specifically, we refine homogeneous motifs into heterogeneous ones by incorporating node-label information, effectively capturing the heterogeneity of neighborhood preferences. Additionally, we systematically evaluate the contribution of different node pairs within heterogeneous motifs to the likelihood of a node being identified as a social bot. Furthermore, we mathematically quantify the maximum capability of each heterogeneous motif, enabling the estimation of its potential benefits. Comprehensive evaluations on four large, publicly available benchmarks confirm that our method surpasses state-of-the-art techniques, achieving superior performance across five evaluation metrics. Moreover, our results reveal that selecting motifs with the highest capability achieves detection performance comparable to using all heterogeneous motifs. Overall, our framework offers an effective and theoretically grounded solution for social bot detection, significantly enhancing cybersecurity measures in social networks.
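The refinement of homogeneous motifs into heterogeneous ones can be illustrated with a minimal sketch. It assumes the simplest possible case: an undirected graph, a single motif type (the closed triangle), and two node labels (`"bot"` and `"human"`). The function name, the toy graph, and the labels are all hypothetical and are not taken from the paper, whose motif set and graph model may differ.

```python
from collections import Counter
from itertools import combinations


def heterogeneous_triangle_counts(adj, labels, node):
    """Count triangles through `node`, split by the labels of the other two nodes.

    A homogeneous count would only report the total number of triangles.
    Keying each triangle by the (sorted) labels of its two other members
    turns that single count into one count per heterogeneous motif.
    """
    counts = Counter()
    for u, v in combinations(sorted(adj[node]), 2):
        if v in adj[u]:  # u and v are linked, so (node, u, v) closes a triangle
            key = tuple(sorted((labels[u], labels[v])))
            counts[key] += 1
    return counts


# Toy undirected graph (assumed data): adjacency sets and node labels.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
labels = {"a": "human", "b": "bot", "c": "human", "d": "human"}

counts = heterogeneous_triangle_counts(adj, labels, "a")
homogeneous_total = sum(counts.values())
```

In this toy graph node `a` sits in two triangles. The homogeneous view sees only the total, while the heterogeneous view separates one triangle with a bot and a human from one triangle with two humans.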
Problem

Research questions and friction points this paper is trying to address.

Detect social bots using heterogeneous motifs and Naive Bayes model
Address heterogeneity in neighborhood preferences for bot detection
Systematically evaluate motif contributions and quantify maximum detection capability
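For orientation, the generic Naïve Bayes decision rule that such a framework builds on can be written as follows. This is the textbook form, not the paper's own derivation: the symbols $x_1, \dots, x_m$ are assumed here to denote motif-based features of a node, and the paper's exact feature definition and capability formula are not given in this summary.

```latex
\frac{P(\text{bot} \mid x_1, \dots, x_m)}{P(\text{human} \mid x_1, \dots, x_m)}
  = \frac{P(\text{bot})}{P(\text{human})}
    \prod_{i=1}^{m} \frac{P(x_i \mid \text{bot})}{P(x_i \mid \text{human})}
```

Under the conditional independence assumption, each motif contributes one likelihood-ratio factor, so a motif whose ratio stays close to 1 adds little evidence either way.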
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using heterogeneous motifs with Naive Bayes model
Refining motifs with node-label information for heterogeneity
Mathematically quantifying motif capability for optimal selection
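The combination of Naïve Bayes scoring and capability-driven Top-K selection might look roughly like the sketch below. It rests on several assumptions that are not from the paper: each node is represented by the set of heterogeneous motifs it takes part in, likelihoods are estimated with Laplace smoothing, and `capability_proxy` (the absolute log-likelihood ratio of motif presence) is a hypothetical stand-in for the paper's maximum-capability measure, whose formula is not stated in this summary.

```python
import math


def fit_motif_likelihoods(samples, labels, motifs, alpha=1.0):
    """Estimate P(motif present | class) with Laplace smoothing."""
    probs = {}
    for cls in ("bot", "human"):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        probs[cls] = {
            m: (sum(m in s for s in rows) + alpha) / (len(rows) + 2 * alpha)
            for m in motifs
        }
    return probs


def capability_proxy(probs, motif):
    """Hypothetical stand-in score, NOT the paper's capability formula."""
    return abs(math.log(probs["bot"][motif] / probs["human"][motif]))


def top_k_motifs(probs, motifs, k):
    """Keep the k motifs with the highest proxy score."""
    ranked = sorted(motifs, key=lambda m: capability_proxy(probs, m), reverse=True)
    return ranked[:k]


def log_odds(sample, probs, motifs, prior_bot=0.5):
    """Naive Bayes log-odds of 'bot' versus 'human' over the chosen motifs."""
    score = math.log(prior_bot / (1 - prior_bot))
    for m in motifs:
        p_bot, p_human = probs["bot"][m], probs["human"][m]
        if m in sample:
            score += math.log(p_bot / p_human)
        else:
            score += math.log((1 - p_bot) / (1 - p_human))
    return score


# Toy training data (assumed): the motifs each labeled node takes part in.
motifs = ["tri_bot_bot", "tri_bot_human", "tri_human_human"]
samples = [
    {"tri_bot_bot", "tri_bot_human"},
    {"tri_bot_bot"},
    {"tri_human_human", "tri_bot_human"},
    {"tri_human_human"},
]
labels = ["bot", "bot", "human", "human"]

probs = fit_motif_likelihoods(samples, labels, motifs)
selected = top_k_motifs(probs, motifs, k=2)
bot_like = log_odds({"tri_bot_bot"}, probs, selected)
human_like = log_odds({"tri_human_human"}, probs, selected)
```

In the toy data, `tri_bot_human` appears equally often in both classes, so its proxy score is zero and it is the motif dropped by the Top-2 selection. A positive log-odds value favors the bot class and a negative one favors the human class.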
Yijun Ran
Beijing Normal University
Network Science, Complex Systems, Social Bots
Jingjing Xiao
Director, Bio-Med Informatics Research Center, Xinqiao Hospital, Army Medical University, China
Computer Vision, Medical Image Processing
Xiao-Ke Xu
Center for Computational Communication Research, Beijing Normal University, Zhuhai 519087, People’s Republic of China, and School of Journalism and Communication, Beijing Normal University, Beijing 100875, People’s Republic of China