Group-Adaptive Adversarial Learning for Robust Fake News Detection Against Malicious Comments

📅 2025-10-10
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Current fake news detection (FND) models exhibit insufficient robustness against adversarial comments, whether crafted by malicious human users or generated by large language models (LLMs). To address this, we propose a group-adaptive adversarial training framework. Methodologically, we first establish a psychology-driven adversarial comment classification system grounded in perceptual, cognitive, and social dimensions. We then introduce a Dirichlet-distribution-based dynamic sampling mechanism to enable cross-category adaptive learning. Furthermore, we integrate LLM-generated diverse adversarial examples with an InfoDirichlet category-aware optimization strategy. Evaluated on multiple benchmark datasets, our model maintains high detection accuracy while significantly improving resilience against heterogeneous adversarial perturbations. Empirical results demonstrate superior robustness compared to state-of-the-art FND approaches.
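The category-specific attack generation described here can be pictured with a short sketch. This is a minimal illustration assuming a prompt-template approach; the category hints, the `build_attack_prompt` helper, and the `some_llm.generate` call are all hypothetical, as the paper does not publish its prompts or generation model.

```python
# Illustrative only: how category-specific adversarial comments might be
# requested from an LLM. Prompt wording and the `generate` call below are
# assumptions, not the paper's released implementation.
CATEGORY_HINTS = {
    "perceptual": "cast doubt on the post's images, sources, or surface details",
    "cognitive": "introduce plausible-sounding but misleading reasoning",
    "social": "appeal to group identity, authority, or popular opinion",
}

def build_attack_prompt(news_text: str, category: str) -> str:
    """Compose a generation prompt for one psychological category."""
    hint = CATEGORY_HINTS[category]
    return (
        "Write a short reader comment on the news item below that tries to "
        f"{hint}, without stating outright whether the item is true or false.\n\n"
        f"News item: {news_text}"
    )

# Hypothetical usage with any LLM client:
# prompt = build_attack_prompt(article_text, "cognitive")
# adversarial_comment = some_llm.generate(prompt)
```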

📝 Abstract
The spread of fake news online distorts public judgment and erodes trust in social media platforms. Although recent fake news detection (FND) models perform well in standard settings, they remain vulnerable to adversarial comments, authored by real users or by large language models (LLMs), that subtly shift model decisions. In view of this, we first present a comprehensive evaluation of comment attacks against existing fake news detectors and then introduce a group-adaptive adversarial training strategy to improve the robustness of FND models. Specifically, our approach comprises three steps: (1) dividing adversarial comments into three psychologically grounded categories: perceptual, cognitive, and societal; (2) generating diverse, category-specific attacks via LLMs to enhance adversarial training; and (3) applying a Dirichlet-based adaptive sampling mechanism (the InfoDirichlet Adjusting Mechanism) that dynamically adjusts the learning focus across comment categories during training. Experiments on benchmark datasets show that our method maintains strong detection accuracy while substantially increasing robustness to a wide range of adversarial comment perturbations.
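Step (3) can be pictured in a few lines. The following is a minimal sketch, assuming a concentration-update rule that shifts sampling mass toward higher-loss categories; the function names and the update rule are our illustration, not the paper's actual InfoDirichlet Adjusting Mechanism.

```python
# Minimal sketch of Dirichlet-based adaptive sampling over the three
# adversarial comment categories. The update rule below is an assumption.
import numpy as np

CATEGORIES = ["perceptual", "cognitive", "societal"]

def sample_category_weights(alpha: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Draw mixing proportions over the categories; larger alpha entries
    concentrate more training focus on the corresponding category."""
    return rng.dirichlet(alpha)

def update_concentrations(alpha: np.ndarray, per_category_loss: np.ndarray,
                          step: float = 0.5) -> np.ndarray:
    """Assumed rule: move concentration mass toward categories with higher
    loss, so harder adversarial groups are sampled more often."""
    norm_loss = per_category_loss / per_category_loss.sum()
    return np.maximum(alpha + step * (norm_loss - 1.0 / len(alpha)), 0.1)

rng = np.random.default_rng(0)
alpha = np.ones(3)  # start uniform over the three categories
for epoch in range(3):
    weights = sample_category_weights(alpha, rng)
    # ... build a minibatch mixing adversarial comments per `weights`,
    # train the detector, and record a loss for each category ...
    per_category_loss = rng.uniform(0.5, 1.5, size=3)  # placeholder losses
    alpha = update_concentrations(alpha, per_category_loss)
    print(dict(zip(CATEGORIES, np.round(weights, 3))))
```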
Problem

Research questions and friction points this paper is trying to address.

Detecting fake news under adversarial comment attacks
Improving model robustness against malicious comment perturbations
Developing adaptive training for diverse adversarial comment categories
Innovation

Methods, ideas, or system contributions that make the work stand out.

Group-adaptive adversarial training strategy for robustness
Three psychologically grounded comment categories for classification
Dirichlet-based adaptive sampling mechanism for dynamic learning
Zhao Tong
Inria Sophia Antipolis
Chunlin Gong
University of Minnesota Twin Cities, Minneapolis, United States
Yimeng Gu
Queen Mary University of London, London, United Kingdom
Haichao Shi
Institute of Information Engineering, Chinese Academy of Sciences
Qiang Liu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Shu Wu
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Xiao-Yu Zhang
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China