Efficient Annotator Reliability Assessment and Sample Weighting for Knowledge-Based Misinformation Detection on Social Media

📅 2024-10-18

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the rapid spread of misinformation on social media and the reduced model robustness caused by noisy annotations, this paper proposes a reliability-aware approach that frames misinformation detection as a problem similar to natural language inference (NLI). Methodologically: (1) it introduces EffiARA, an annotation framework that uses inter- and intra-annotator agreement to estimate the reliability of each annotator; (2) it weights training samples by annotator reliability and combines this with knowledge-based classification, soft-label training, and fine-tuning of Llama-3.2-1B and TwHIN-BERT-large. Contributions include: (i) releasing RUC-MCD, the publicly available Russo-Ukrainian Conflict Knowledge-Based Misinformation Classification Dataset; (ii) achieving a macro-F1 of 0.757 with Llama-3.2-1B and 0.740 with TwHIN-BERT-large; and (iii) finding that sample weighting by annotator reliability, using both inter- and intra-annotator agreement together with soft-label training, performs best.
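For readers unfamiliar with the reported metric, macro-F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how often it occurs. The sketch below is a generic illustration, not the paper's evaluation code; the function name and the integer class labels are assumptions made for the example.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels: class 0 has F1 = 2/3, class 1 has F1 = 4/5.
score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Because the mean is unweighted, a model that neglects a rare class is penalised more heavily than it would be under accuracy.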

📝 Abstract

Misinformation spreads rapidly on social media, obscuring the truth and targeting potentially vulnerable people. To effectively mitigate the negative impact of misinformation, it must first be accurately detected before a mitigation strategy is applied, such as X's community notes, which is currently a manual process. This study takes a knowledge-based approach to misinformation detection, modelling the problem similarly to one of natural language inference. The EffiARA annotation framework is introduced, aiming to utilise inter- and intra-annotator agreement to understand the reliability of each annotator and to influence the training of large language models for classification based on annotator reliability. In assessing the EffiARA annotation framework, the Russo-Ukrainian Conflict Knowledge-Based Misinformation Classification Dataset (RUC-MCD) was developed and made publicly available. This study finds that sample weighting using annotator reliability performs best when it utilises both inter- and intra-annotator agreement together with soft-label training. The highest classification performance was a macro-F1 of 0.757 using Llama-3.2-1B and 0.740 using TwHIN-BERT-large.
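The abstract does not spell out how agreement is turned into a training weight. As a minimal sketch only, assume that a plain agreement rate stands in for the paper's agreement metric and that reliability is a linear mix of inter- and intra-annotator agreement. The mixing parameter `alpha`, the function names, and the integer labels are all hypothetical and are not taken from EffiARA.

```python
import math


def agreement_rate(labels_a, labels_b):
    """Fraction of items on which two label sequences match (assumed metric)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def reliability(inter_agreement, intra_agreement, alpha=0.5):
    """Hypothetical mix: alpha weights agreement with others, 1 - alpha with self."""
    return alpha * inter_agreement + (1 - alpha) * intra_agreement


def soft_cross_entropy(pred_probs, soft_label):
    """Cross-entropy against a probability distribution rather than a hard label."""
    return -sum(t * math.log(max(p, 1e-12))
                for p, t in zip(pred_probs, soft_label) if t > 0)


def weighted_loss(batch_probs, batch_soft_labels, weights):
    """Mean soft-label loss, each sample scaled by its annotator's reliability."""
    total = sum(w * soft_cross_entropy(p, t)
                for p, t, w in zip(batch_probs, batch_soft_labels, weights))
    return total / sum(weights)


# One annotator labels four items twice and shares them with a partner.
first_pass, second_pass = [1, 0, 1, 1], [1, 0, 0, 1]
partner = [1, 1, 1, 0]
intra = agreement_rate(first_pass, second_pass)  # self-consistency
inter = agreement_rate(first_pass, partner)      # consistency with others
weight = reliability(inter, intra)

loss = weighted_loss([[0.7, 0.3]], [[0.8, 0.2]], [weight])
```

Under these assumptions, samples from annotators who are consistent both with themselves and with others contribute more to the loss, while the soft label keeps any annotator uncertainty in the training target.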

Problem

Research questions and friction points this paper is trying to address.

Social Media

Misinformation

Quality Assessment

Innovation

Methods, ideas, or system contributions that make the work stand out.

EffiARA

Fake News Detection

Knowledge-driven Models
