🤖 AI Summary
This work addresses the lack of quantitative evaluation standards for "explanatory actionability" in interpretable Automated Fact-Checking (AFC). Methodologically, it introduces the first fine-grained, web-augmented framework for assessing explanation actionability, calibrated against human judgments. It formally defines actionability for AFC explanations, establishes a multi-dimensional evaluation taxonomy, and constructs a dedicated benchmark dataset. The framework integrates real-time web retrieval, structured scoring, human annotation consistency modeling, and Pearson/Kendall correlation analysis. Experiments demonstrate that the proposed framework correlates with human judgments significantly better than state-of-the-art evaluators, attaining the highest Pearson and Kendall coefficients while exhibiting the lowest ego-centric bias. These results support its robustness and practical utility for evaluating actionable explanations in AFC.
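As a rough illustration of the meta-evaluation step described above, the sketch below computes Pearson and Kendall correlations between an evaluator's actionability scores and human ratings using SciPy. The score values and the 1-5 rating scale are assumptions for illustration, not the paper's benchmark data.

```python
# Minimal sketch: correlating an automatic evaluator's actionability scores
# with human judgments on the same explanations. All scores are hypothetical.
from scipy.stats import pearsonr, kendalltau

human_scores = [4, 2, 5, 3, 1, 4, 2, 5]                        # human ratings (assumed 1-5 scale)
evaluator_scores = [3.8, 2.1, 4.6, 3.2, 1.4, 4.1, 2.5, 4.9]    # framework's scores on the same items

pearson_r, pearson_p = pearsonr(human_scores, evaluator_scores)
kendall_tau, kendall_p = kendalltau(human_scores, evaluator_scores)

print(f"Pearson r = {pearson_r:.3f} (p = {pearson_p:.3f})")
print(f"Kendall tau = {kendall_tau:.3f} (p = {kendall_p:.3f})")
```

Higher coefficients indicate that the evaluator's rankings track human judgments more closely, which is the criterion on which the framework is compared against prior evaluators.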
📝 Abstract
The field of explainable Automated Fact-Checking (AFC) aims to enhance the transparency and trustworthiness of automated fact-verification systems by providing clear and comprehensible explanations. However, the effectiveness of these explanations depends on their actionability: their ability to empower users to make informed decisions and mitigate misinformation. Despite actionability being a critical property of high-quality explanations, no prior research has proposed a dedicated method to evaluate it. This paper introduces FinGrAct, a fine-grained, web-enabled evaluation framework designed to assess actionability in AFC explanations through well-defined criteria and an evaluation dataset. FinGrAct surpasses state-of-the-art (SOTA) evaluators, achieving the highest Pearson and Kendall correlations with human judgments while demonstrating the lowest ego-centric bias, making it a more robust approach for actionability evaluation in AFC.
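For intuition on the ego-centric bias claim, the following sketch shows one plausible way to quantify such bias: the gap between the mean score an LLM-based evaluator assigns to explanations it generated itself and the mean score it assigns to other models' explanations. This gap-based definition and all values are illustrative assumptions; the paper's exact bias metric may differ.

```python
# Minimal sketch of a gap-based ego-centric bias measure (an assumption,
# not the paper's definition). All scores are hypothetical.
from statistics import mean

scores_own = [4.6, 4.8, 4.5, 4.7]     # evaluator scoring its own explanations
scores_other = [3.9, 4.1, 3.8, 4.0]   # evaluator scoring other models' explanations

ego_bias = mean(scores_own) - mean(scores_other)
print(f"Ego-centric bias (mean score gap): {ego_bias:.2f}")
# A gap near 0 suggests the evaluator does not systematically favor its own outputs.
```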