🤖 AI Summary
Large language models (LLMs) can generate malicious message injections that pose a novel adversarial threat to message-propagation-tree (MPT)-based rumor detection systems. Method: This paper proposes a contrastive learning–based robustness enhancement framework that challenges the conventional “influential-node” assumption by uniformly modeling node-level predictive influence as a contrastive objective—thereby encouraging graph neural networks to leverage information from all nodes more equitably. The method jointly integrates MPT structural modeling, adversarial robust training, and influence-equalization constraints. Contribution/Results: Evaluated on Twitter and Weibo datasets, the approach maintains high clean accuracy while significantly improving adversarial robustness: LLM-driven injection attack success rates drop by over 40% on average. It establishes a transferable, robust learning paradigm for rumor detection.
📝 Abstract
In the era of rapidly evolving large language models (LLMs), state-of-the-art rumor detection systems, particularly those based on Message Propagation Trees (MPTs), which represent a conversation tree with the post as its root and the replies as its descendants, are facing increasing threats from adversarial attacks that leverage LLMs to generate and inject malicious messages. Existing methods are based on the assumption that different nodes exhibit varying degrees of influence on predictions. They define nodes with high predictive influence as important nodes and target them for attacks. If the model treats nodes' predictive influence more uniformly, attackers will find it harder to target high predictive influence nodes. In this paper, we propose Similarizing the predictive Influence of Nodes with Contrastive Learning (SINCon), a defense mechanism that encourages the model to learn graph representations where nodes with varying importance have a more uniform influence on predictions. Extensive experiments on the Twitter and Weibo datasets demonstrate that SINCon not only preserves high classification accuracy on clean data but also significantly enhances resistance against LLM-driven message injection attacks.