Efficient Annotator Reliability Assessment and Sample Weighting for Knowledge-Based Misinformation Detection on Social Media

📅 2024-10-18
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the rapid spread of misinformation on social media and the poor model robustness caused by noisy annotation, this paper proposes a reliability-aware modelling approach that treats misinformation detection as a problem similar to natural language inference (NLI). Methodologically: (1) it introduces EffiARA, an annotation framework that uses inter- and intra-annotator agreement to estimate the reliability of each annotator; (2) it weights training samples by annotator reliability and combines this with knowledge-based NLI-style modelling, soft-label training, and fine-tuning of Llama-3.2-1B and TwHIN-BERT-large. Contributions include: (i) the public release of RUC-MCD, the Russo-Ukrainian Conflict Knowledge-Based Misinformation Classification Dataset; (ii) a best macro-F1 of 0.757 on RUC-MCD with Llama-3.2-1B and 0.740 with TwHIN-BERT-large; and (iii) empirical evidence that sample weighting by annotator reliability, using both inter- and intra-annotator agreement together with soft-label training, performs best.

πŸ“ Abstract
Misinformation spreads rapidly on social media, obscuring the truth and targeting potentially vulnerable people. To mitigate its negative impact effectively, misinformation must first be accurately detected before a mitigation strategy can be applied, such as X's Community Notes, which is currently a manual process. This study takes a knowledge-based approach to misinformation detection, modelling the problem similarly to one of natural language inference. The EffiARA annotation framework is introduced, aiming to utilise inter- and intra-annotator agreement to understand the reliability of each annotator and to influence the training of large language models for classification based on annotator reliability. In assessing the EffiARA annotation framework, the Russo-Ukrainian Conflict Knowledge-Based Misinformation Classification Dataset (RUC-MCD) was developed and made publicly available. This study finds that sample weighting using annotator reliability performs best when it utilises both inter- and intra-annotator agreement together with soft-label training. The highest classification performance achieved was a macro-F1 of 0.757 using Llama-3.2-1B and 0.740 using TwHIN-BERT-large.
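The abstract does not give the formulas EffiARA uses, so the following is only a minimal sketch of the general idea of reliability-based sample weighting with soft labels. All function names, the reliability scores, and the choice of "mean annotator reliability" as the sample weight are illustrative assumptions, not the authors' method.

```python
import math

# Hypothetical helpers: names and weighting scheme are assumptions for illustration.

def soft_label(annotations, reliability, num_classes):
    """Build a soft label as the reliability-weighted share of votes per class.

    annotations: list of (annotator_id, class_index) pairs for one sample.
    reliability: dict mapping annotator_id to a positive reliability score.
    """
    totals = [0.0] * num_classes
    for annotator, label in annotations:
        totals[label] += reliability[annotator]
    norm = sum(totals)
    return [t / norm for t in totals]


def sample_weight(annotations, reliability):
    """Assumed sample weight: mean reliability of the annotators of this sample."""
    return sum(reliability[a] for a, _ in annotations) / len(annotations)


def weighted_soft_cross_entropy(probs, targets, weights):
    """Weighted mean of soft-label cross-entropy over a batch.

    probs: predicted class probabilities per sample (each row sums to 1).
    targets: soft labels per sample.
    weights: one non-negative weight per sample.
    """
    total = 0.0
    for p, t, w in zip(probs, targets, weights):
        # Skip classes with zero target mass to avoid log(0) contributions.
        ce = -sum(ti * math.log(pi) for pi, ti in zip(p, t) if ti > 0)
        total += w * ce
    return total / sum(weights)


# Usage with made-up reliability scores for two annotators who disagree.
reliability = {"a": 1.2, "b": 0.8}
annotations = [("a", 0), ("b", 1)]
target = soft_label(annotations, reliability, num_classes=3)
weight = sample_weight(annotations, reliability)
loss = weighted_soft_cross_entropy([[0.6, 0.3, 0.1]], [target], [weight])
```

In this sketch the more reliable annotator contributes more probability mass to the soft label, and samples labelled by more reliable annotators contribute more to the loss; how reliability is actually derived from inter- and intra-annotator agreement is left to the paper.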
Problem

Research questions and friction points this paper is trying to address.

Social Media
Misinformation
Quality Assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

EffiARA
Fake News Detection
Knowledge-driven Models
Owen Cook
PhD Student in Computer Science, University of Sheffield
natural language processing, machine learning, misinformation detection
Charlie Grimshaw
School of Computer Science, The University of Sheffield, Sheffield, UK
Ben Wu
School of Computer Science, The University of Sheffield, Sheffield, UK
Sophie Dillon
School of Computer Science, The University of Sheffield, Sheffield, UK
Jack Hicks
School of Computer Science, The University of Sheffield, Sheffield, UK
Luke Jones
School of Computer Science, The University of Sheffield, Sheffield, UK
Thomas Smith
School of Computer Science, The University of Sheffield, Sheffield, UK
Matyas Szert
School of Computer Science, The University of Sheffield, Sheffield, UK
Xingyi Song
University of Sheffield
machine learning, machine translation, natural language processing