AtomEval: Atomic Evaluation of Adversarial Claims in Fact Verification

📅 2026-04-09
🤖 AI Summary
Current adversarial evaluation metrics for fact-checking systems struggle to detect rewrites that are semantically distorted yet similar on the surface, and so often misjudge whether an adversarial example is valid. To address this limitation, this work proposes AtomEval, a validity-aware evaluation framework that decomposes claims into SROM (subject-relation-object-modifier) atoms. AtomEval applies Atomic Validity Scoring (AVS) to assess the factual consistency of adversarial rewrites beyond superficial similarity. Experiments on the FEVER dataset indicate that AtomEval yields more reliable evaluation signals and that stronger large language models do not necessarily generate more effective adversarial claims under validity-aware evaluation.
📝 Abstract
Adversarial claim rewriting is widely used to test fact-checking systems, but standard metrics fail to capture truth-conditional consistency and often label semantically corrupted rewrites as successful. We introduce AtomEval, a validity-aware evaluation framework that decomposes claims into subject-relation-object-modifier (SROM) atoms and scores adversarial rewrites with Atomic Validity Scoring (AVS), enabling detection of factual corruption beyond surface similarity. Experiments on the FEVER dataset across representative attack strategies and LLM generators show that AtomEval provides more reliable evaluation signals. Using AtomEval, we further analyze LLM-based adversarial generators and observe that stronger models do not necessarily produce more effective adversarial claims under validity-aware evaluation, highlighting previously overlooked limitations in current adversarial evaluation practices.
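To make the idea of atom-level comparison concrete, here is a minimal Python sketch. It is not the paper's AVS: the abstract does not give the scoring formula or the decomposition procedure, so the `Atom` class, the `atom_overlap_score` function, the exact-match rule, and the hand-written atoms below are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Atom:
    """Hypothetical SROM atom: subject, relation, object, optional modifier."""
    subject: str
    relation: str
    obj: str
    modifier: Optional[str] = None


def atom_overlap_score(original: List[Atom], rewrite: List[Atom]) -> float:
    """Fraction of the original claim's atoms that reappear unchanged in the rewrite.

    Assumption: exact matching stands in for whatever validity check the
    paper's AVS actually performs.
    """
    if not original:
        return 1.0  # nothing to preserve, so nothing can be corrupted
    rewritten = set(rewrite)
    kept = sum(1 for atom in original if atom in rewritten)
    return kept / len(original)


# Hand-written atoms for an illustrative claim (no automatic decomposition here).
original = [
    Atom("Marie Curie", "won", "Nobel Prize", "in 1903"),
    Atom("Marie Curie", "was born in", "Warsaw"),
]
# A rewrite that keeps both facts intact.
faithful = list(reversed(original))
# A surface-similar rewrite that silently changes one fact.
corrupted = [
    Atom("Marie Curie", "won", "Nobel Prize", "in 1913"),
    Atom("Marie Curie", "was born in", "Warsaw"),
]

faithful_score = atom_overlap_score(original, faithful)
corrupted_score = atom_overlap_score(original, corrupted)
```

In this toy setup the corrupted rewrite differs from the original by a single digit, which a surface-similarity metric would barely register, yet it loses one of the two atoms. A real system would also need a decomposition step and a softer matching rule, neither of which is specified in the text above.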
Problem

Research questions and friction points this paper is trying to address.

adversarial claims
fact verification
truth-conditional consistency
evaluation metrics
semantic corruption
Innovation

Methods, ideas, or system contributions that make the work stand out.

AtomEval
adversarial claim rewriting
fact verification
atomic validity scoring
SROM decomposition