Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection

📅 2025-06-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Misinformation detection models frequently rely on spurious surface-level cues ("shortcuts"), which severely undermines their generalization—a problem exacerbated by large language models (LLMs), which can cheaply generate convincing misinformation. To address this, the authors propose TruthOverTricks, a unified evaluation paradigm that categorizes and diagnoses two distinct shortcut types: intrinsically induced and extrinsically injected. They introduce two new fact-centric misinformation datasets—NQ-Misinfo and Streaming-Misinfo—to support rigorous robustness assessment. They further design SMF, an LLM-augmented data augmentation framework that integrates semantic rewriting, factual summarization, and sentiment normalization. Evaluation across 14 established benchmarks plus the two new datasets (16 in total) shows that SMF consistently improves robustness and mitigates shortcut reliance. All datasets, code, and models are publicly released to advance trustworthy misinformation detection research.

📝 Abstract
Misinformation detection models often rely on superficial cues (i.e., *shortcuts*) that correlate with misinformation in training data but fail to generalize to the diverse and evolving nature of real-world misinformation. This issue is exacerbated by large language models (LLMs), which can easily generate convincing misinformation through simple prompts. We introduce TruthOverTricks, a unified evaluation paradigm for measuring shortcut learning in misinformation detection. TruthOverTricks categorizes shortcut behaviors into intrinsic shortcut induction and extrinsic shortcut injection, and evaluates seven representative detectors across 14 popular benchmarks, along with two new factual misinformation datasets, NQ-Misinfo and Streaming-Misinfo. Empirical results reveal that existing detectors suffer severe performance degradation when exposed to both naturally occurring and adversarially crafted shortcuts. To address this, we propose SMF, an LLM-augmented data augmentation framework that mitigates shortcut reliance through paraphrasing, factual summarization, and sentiment normalization. SMF consistently enhances robustness across 16 benchmarks, encouraging models to rely on deeper semantic understanding rather than shortcut cues. To promote the development of misinformation detectors, we have published the resources publicly at https://github.com/whr000001/TruthOverTricks.
Problem

Research questions and friction points this paper is trying to address.

Detectors relying on unreliable superficial cues rather than factual content
Severe performance drops when detectors face naturally occurring or adversarially crafted shortcuts
How to reduce shortcut reliance through LLM-augmented data augmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified evaluation paradigm for shortcut learning
LLM-augmented data augmentation framework SMF
Paraphrasing, factual summarization, and sentiment normalization as augmentation operations
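The SMF idea—generating paraphrased, fact-summarized, and sentiment-normalized variants of each training example so a detector cannot latch onto surface cues—can be sketched as a small augmentation loop. This is a minimal illustration, not the paper's implementation: the prompt templates and the `call_llm` helper are hypothetical (here `call_llm` is stubbed to echo its input; a real pipeline would query an LLM endpoint).

```python
# Hypothetical sketch of an SMF-style augmentation pipeline: each training
# example is expanded with three LLM-driven variants (paraphrase, factual
# summary, sentiment normalization), all keeping the original label.

PROMPTS = {
    "paraphrase": "Rewrite the text with different wording, same meaning:\n{text}",
    "summarize": "Summarize only the factual claims in the text:\n{text}",
    "neutralize": "Rewrite the text in a neutral, unemotional tone:\n{text}",
}

def call_llm(prompt: str) -> str:
    # Stub for illustration: a real implementation would call an LLM API.
    # Here we simply return the text portion of the prompt unchanged.
    return prompt.rsplit("\n", 1)[-1]

def augment(text: str, label: int) -> list[tuple[str, int]]:
    """Return the original example plus one variant per operation,
    all sharing the original label."""
    examples = [(text, label)]
    for template in PROMPTS.values():
        variant = call_llm(template.format(text=text))
        examples.append((variant, label))
    return examples

# One noisy, emotionally charged example yields four training instances.
train = augment("BREAKING!!! Miracle cure found, doctors furious!", 1)
```

The intent of the three operations is that stylistic shortcuts (sensational wording, emotional tone) vary across the variants while the factual content and label stay fixed, pushing the detector toward semantic rather than surface features.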