📝 Abstract
With the degradation of guardrails against mis- and disinformation online, it is more critical than ever to be able to combat them effectively. In this paper, we explore the efficiency and effectiveness of crowdsourced truthfulness assessments based on condensed, large language model (LLM)-generated summaries of online sources. We compare the use of generated summaries to the use of original web pages in an A/B testing setting, employing a large and diverse pool of crowd workers to perform the truthfulness assessments. We evaluate the quality of the assessments, the efficiency with which they are performed, and the behavior and engagement of participants. Our results demonstrate that the Summary modality, which relies on summarized evidence, yields no significant change in assessment accuracy compared to the Standard modality, while significantly increasing the speed with which assessments are performed. Workers using summarized evidence produce a significantly higher number of assessments in the same time frame, reducing the cost of acquiring truthfulness assessments. Additionally, the Summary modality maximizes both inter-annotator agreement and the reliance on and perceived usefulness of evidence, demonstrating the utility of summarized evidence without sacrificing assessment quality.