🤖 AI Summary
This study addresses the challenges of verifying climate-related claims against scientific literature, which stem from the specialised nature of scholarly evidence and the diverse rhetorical strategies behind climate disinformation. ClimateCheck 2026, the second iteration of the shared task, triples the training data of the 2025 edition and adds a new disinformation narrative classification task. Beyond the standard metrics (Recall@K and Binary Preference), the organisers adapt an automated framework for assessing retrieval quality under incomplete annotations. Participating systems combined dense retrieval pipelines, cross-encoder ensembles, and large language models with structured hierarchical reasoning. The competition attracted 20 registered participants and 8 leaderboard submissions. The analysis exposes systematic biases in how conventional metrics rank systems and shows that not all climate disinformation is equally verifiable.
📝 Abstract
Automatically verifying climate-related claims against scientific literature is a challenging task, complicated by the specialised nature of scholarly evidence and the diversity of rhetorical strategies underlying climate disinformation. ClimateCheck 2026 is the second iteration of a shared task addressing this challenge, expanding on the 2025 edition with tripled training data and a new disinformation narrative classification task. Running from January to February 2026 on the CodaBench platform, the competition attracted 20 registered participants and 8 leaderboard submissions, with systems combining dense retrieval pipelines, cross-encoder ensembles, and large language models with structured hierarchical reasoning. In addition to standard evaluation metrics (Recall@K and Binary Preference), we adapt an automated framework to assess retrieval quality under incomplete annotations, exposing systematic biases in how conventional metrics rank systems. A cross-task analysis further reveals that not all climate disinformation is equally verifiable, with potential implications for how future fact-checking systems should be designed.
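For readers unfamiliar with the two standard metrics named above, the following is a minimal sketch of Recall@K and Binary Preference (Bpref) for a single query. It is not the shared task's scoring code: the function names, the document IDs, and the choice of one common Bpref formulation (relevant documents penalised by the judged non-relevant documents ranked above them, normalised by min(R, N)) are assumptions made for illustration. The point it shows is that Bpref ignores unjudged documents, which is why it is often used when annotations are incomplete.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all judged-relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def bpref(ranked: Sequence[str], relevant: Set[str], nonrelevant: Set[str]) -> float:
    """Binary Preference for one query (one common formulation, assumed here).

    Documents that appear in neither `relevant` nor `nonrelevant` are
    unjudged and are skipped entirely.
    """
    num_rel, num_nonrel = len(relevant), len(nonrelevant)
    if num_rel == 0:
        return 0.0
    bound = min(num_rel, num_nonrel)
    nonrel_seen = 0  # judged non-relevant documents encountered so far
    total = 0.0
    for doc in ranked:
        if doc in nonrelevant:
            nonrel_seen += 1
        elif doc in relevant:
            if nonrel_seen == 0 or bound == 0:
                total += 1.0
            else:
                total += 1.0 - min(nonrel_seen, bound) / bound
        # unjudged documents contribute nothing and incur no penalty
    return total / num_rel


# Hypothetical example: "u" is an unjudged document.
example_ranking = ["a", "x", "b", "u", "c"]
example_relevant = {"a", "b", "c"}
example_nonrelevant = {"x", "y"}
r_at_2 = recall_at_k(example_ranking, example_relevant, 2)
bp = bpref(example_ranking, example_relevant, example_nonrelevant)
```

In this toy ranking, removing the unjudged document "u" leaves the Bpref score unchanged, whereas a metric that treats unjudged documents as non-relevant would change.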