🤖 AI Summary
This work addresses the challenge that current large language models struggle to reliably detect counterfactual misinformation in financial news without reference texts, owing to their limited capacity for coherent reasoning over dispersed contextual cues. To this end, we introduce RFC Bench, the first reference-free benchmark tailored to counterfactual misinformation detection in the financial domain. It comprises paragraph-level authentic news articles and features two tasks: reference-free misinformation detection and pairwise contrastive diagnosis of original and perturbed news. Leveraging context-aware evaluation metrics, we systematically assess model stability and output validity in the absence of external references. Experimental results reveal that while model performance improves significantly with contrastive context, it suffers from pervasive belief instability and invalid outputs under reference-free conditions, exposing a fundamental limitation in real-world reliability.
📝 Abstract
We introduce RFC Bench, a benchmark for evaluating large language models on financial misinformation under realistic news conditions. RFC Bench operates at the paragraph level and captures the contextual complexity of financial news, where meaning emerges from dispersed cues. The benchmark defines two complementary tasks: reference-free misinformation detection and comparison-based diagnosis using paired original and perturbed inputs. Experiments reveal a consistent pattern: performance is substantially stronger when comparative context is available, while reference-free settings expose significant weaknesses, including unstable predictions and elevated rates of invalid outputs. These results indicate that current models struggle to maintain coherent belief states without external grounding. By highlighting this gap, RFC Bench provides a structured testbed for studying reference-free reasoning and advancing more reliable financial misinformation detection in real-world settings.