Adversarial Attacks Against Automated Fact-Checking: A Survey

📅 2025-09-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Amid rampant misinformation, automated fact-checking (AFC) systems are increasingly targeted by adversarial attacks that undermine their reliability. This paper systematically classifies adversarial attack vectors against AFC into three categories: claim manipulation, evidence perturbation, and claim-evidence pair tampering, the first such taxonomy in the literature. We propose an integrated analytical framework spanning attack generation, evidence-level perturbation, and robustness evaluation. The framework is grounded in a comprehensive survey and empirical validation across multiple techniques, including natural language inference (NLI)-based attacks and multimodal adversarial example generation, to identify critical model vulnerabilities. We also synthesize state-of-the-art defense strategies and pinpoint persistent challenges, notably robust training and trustworthy evidence provenance. Our work provides both theoretical foundations and actionable methodologies for strengthening AFC resilience, advancing the development of trustworthy information-verification infrastructure.
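The claim-manipulation category can be illustrated with a minimal, self-contained sketch. The bag-of-words "fact-checker", its 0.9 overlap threshold, and the synonym table below are illustrative assumptions, not models or parameters from the survey; the point is only that a meaning-preserving single-word swap can flip a brittle verifier's verdict.

```python
# Toy claim-manipulation attack against a naive fact-checker.
# The verifier, threshold, and synonym table are illustrative assumptions.

def verify(claim: str, evidence: str) -> str:
    """Naive bag-of-words verifier: SUPPORTED if nearly all claim
    words appear in the evidence, else NOT ENOUGH INFO."""
    claim_words = set(claim.lower().split())
    evid_words = set(evidence.lower().split())
    overlap = len(claim_words & evid_words) / len(claim_words)
    return "SUPPORTED" if overlap >= 0.9 else "NOT ENOUGH INFO"

# Hand-picked, meaning-preserving substitutions (an assumption; real
# attacks use NLI models or paraphrasers to propose candidates).
SYNONYMS = {"founded": ["established", "started"],
            "company": ["firm", "venture"]}

def perturb(claim: str):
    """Yield claim variants produced by single-word synonym swaps."""
    words = claim.split()
    for i, w in enumerate(words):
        for s in SYNONYMS.get(w.lower(), []):
            yield " ".join(words[:i] + [s] + words[i + 1:])

def attack(claim: str, evidence: str):
    """Return the first variant whose verdict differs from the
    original claim's verdict, or None if no swap flips it."""
    original = verify(claim, evidence)
    for variant in perturb(claim):
        flipped = verify(variant, evidence)
        if flipped != original:
            return variant, original, flipped
    return None

evidence = "The company was founded in 2004 in California"
claim = "The company was founded in 2004"
print(verify(claim, evidence))   # -> SUPPORTED (full word overlap)
print(attack(claim, evidence))   # a synonym swap flips the verdict
```

Even this crude search mirrors the structure of real attacks: generate semantically equivalent candidates, query the target model, and keep the first candidate that changes the label.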

📝 Abstract
In an era where misinformation spreads freely, fact-checking (FC) plays a crucial role in verifying claims and promoting reliable information. While automated fact-checking (AFC) has advanced significantly, existing systems remain vulnerable to adversarial attacks that manipulate or generate claims, evidence, or claim-evidence pairs. These attacks can distort the truth, mislead decision-makers, and ultimately undermine the reliability of FC models. Despite growing research interest in adversarial attacks against AFC systems, a comprehensive overview of the key challenges remains lacking. These challenges include understanding attack strategies, assessing the resilience of current models, and identifying ways to enhance robustness. This survey provides the first in-depth review of adversarial attacks targeting FC, categorizing existing attack methodologies and evaluating their impact on AFC systems. Additionally, we examine recent advancements in adversary-aware defenses and highlight open research questions that require further exploration. Our findings underscore the urgent need for resilient FC frameworks capable of withstanding adversarial manipulations while preserving high verification accuracy.
Problem

Research questions and friction points this paper is trying to address.

Adversarial attacks manipulate automated fact-checking systems
Existing systems lack resilience against claim and evidence manipulation
Survey categorizes attack methodologies and evaluates their impacts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Categorizing adversarial attack methodologies against fact-checking systems
Evaluating impact of attacks on automated fact-checking model performance
Examining adversary-aware defense mechanisms for enhanced robustness
👥 Authors
Fanzhen Liu, School of Computing, Macquarie University, Australia
Alsharif Abuadbba, CSIRO's Data61, Australia
Kristen Moore, Team Lead, CSIRO's Data61 (AI Security, AI Safety, AI for Cyber Security)
Surya Nepal, CSIRO's Data61, Australia (cyber security, data privacy, distributed systems)
Cecile Paris, Chief Research Scientist, CSIRO's Data61, Australia (Natural Language Processing, User Modelling, Social Media, Language, Human Computer Interaction)
Jia Wu, School of Computing, Macquarie University, Australia
Jian Yang, School of Computing, Macquarie University, Australia
Quan Z. Sheng, School of Computing, Macquarie University, Australia