🤖 AI Summary
To address scarce target-domain labels, pronounced graph heterophily, and the poor robustness of single-source domain adaptation in social bot detection, this paper proposes BotTrans, a multi-source graph domain adaptation framework. BotTrans combines graph neural networks (GNNs), attention-weighted neighborhood aggregation, and multi-source domain adaptation techniques. Its key contributions are: (1) a cross-source topology construction mechanism that increases network homophily and thereby mitigates heterophily-induced interference; (2) a source–target relevance-aware weighting strategy for multi-source knowledge transfer; and (3) a target-domain semantic consistency regularization module coupled with result refinement. Evaluated on multiple real-world social network datasets, BotTrans achieves an average 5.2% improvement in F1-score over state-of-the-art methods. It also maintains high performance under extremely low-resource settings, which indicates strong generalization and robustness.
📝 Abstract
Transferring extensive knowledge from relevant social networks has emerged as a promising solution to overcome label scarcity in detecting social bots and other anomalies with GNN-based models. However, effective transfer faces two critical challenges. Firstly, the network heterophily problem, which is caused by bots hiding malicious behaviors via indiscriminately interacting with human users, hinders the model's ability to learn sufficient and accurate bot-related knowledge from source domains. Secondly, single-source transfer might lead to inferior and unstable results, as the source network may embody weak relevance to the task and provide limited knowledge. To address these challenges, we explore multiple source domains and propose a multi-source graph domain adaptation model named BotTrans. We initially leverage the labeling knowledge shared across multiple source networks to establish a cross-source-domain topology with increased network homophily. We then aggregate cross-domain neighbor information to enhance the discriminability of source node embeddings. Subsequently, we integrate the relevance between each source-target pair with model optimization, which facilitates knowledge transfer from source networks that are more relevant to the detection task. Additionally, we propose a refinement strategy to improve detection performance by utilizing semantic knowledge within the target domain. Extensive experiments on real-world datasets demonstrate that BotTrans outperforms the existing state-of-the-art methods, revealing its efficacy in leveraging multi-source knowledge when the target detection task is unlabeled.
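
The idea of weighting each source network by its relevance to the target can be illustrated with a small sketch. This is not the BotTrans formulation: the abstract does not say how relevance is measured, so the distance between mean node embeddings used here is an assumption, and the names `mean_vector`, `relevance_weights`, and `weighted_source_loss` are hypothetical.

```python
import math
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a list of equal-length embedding vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def relevance_weights(
    sources: Sequence[Sequence[Sequence[float]]],
    target: Sequence[Sequence[float]],
    temperature: float = 1.0,
) -> List[float]:
    """Softmax over negative mean-embedding distances.

    Assumption: a source whose mean embedding is closer to the target's
    mean embedding is treated as more relevant and gets a larger weight.
    """
    t_mean = mean_vector(target)
    scores = [-math.dist(mean_vector(s), t_mean) / temperature for s in sources]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_source_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-source losses using the relevance weights."""
    return sum(w * l for w, l in zip(weights, losses))


# Toy data: source 0 lies close to the target, source 1 lies far away.
source_embeddings = [
    [[0.9, 1.1], [1.1, 0.9]],
    [[5.0, 5.0], [6.0, 6.0]],
]
target_embeddings = [[1.0, 1.0], [1.2, 0.8]]

weights = relevance_weights(source_embeddings, target_embeddings)
total_loss = weighted_source_loss([0.4, 0.7], weights)  # per-source losses are made up
```

In this toy setup the nearby source dominates the combined loss, which mirrors the stated goal of favoring knowledge transfer from source networks that are more relevant to the detection task. A trained model would presumably learn or estimate relevance from the data rather than from raw mean distances.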