🤖 AI Summary
This study addresses the critical shortage of human reviewers in peer review and the limitations of existing automated evaluation methods, which struggle to accurately assess the logical support between claims and evidence in review comments. To overcome this, the work proposes an interpretable automatic evaluation framework that explicitly models the “warrant”—the inferential link connecting claims and evidence—going beyond conventional approaches that merely detect the presence of evidence. Leveraging language models to extract claims and supporting evidence, the method introduces a novel quantitative metric, WarrantScore, to measure the strength of reasoning. Experimental results demonstrate that this approach achieves significantly higher correlation with human judgments than current state-of-the-art methods, thereby enhancing both the accuracy and efficiency of automated support for peer review.
📝 Abstract
The scientific peer-review process faces a shortage of human reviewers due to the rapid growth in the number of submitted papers. The use of language models to reduce the human cost of peer review has been actively explored as a potential solution. One proposed method evaluates the level of substantiation in scientific reviews, i.e., the extent to which claims are grounded in objective facts, in a human-interpretable manner: it extracts the core components of an argument, claims and evidence, and scores a review by the proportion of claims supported by evidence. However, merely detecting whether a claim has supporting evidence is insufficient; it is also necessary to accurately assess how well the evidence logically supports the claim. We propose a new evaluation metric for scientific review comments that assesses the logical inference between claims and evidence. Experimental results show that the proposed metric correlates more strongly with human scores than conventional methods, indicating its potential to better support the efficiency of the peer-review process.
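The contrast between merely detecting evidence and assessing the warrant can be sketched as follows. This is an illustrative assumption, not the paper's actual definition: the `Claim` structure, the token-overlap `entailment_score` stand-in (a real system would use a trained language model), and the averaging in `warrant_score` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str
    evidence: Optional[str]  # None if the review offers no supporting evidence

def entailment_score(claim: str, evidence: str) -> float:
    """Toy stand-in for an NLI-style model scoring how strongly the
    evidence entails the claim (the 'warrant'). Uses token overlap
    purely for illustration."""
    c, e = set(claim.lower().split()), set(evidence.lower().split())
    return len(c & e) / len(c) if c else 0.0

def substantiation_level(claims: list[Claim]) -> float:
    """Prior approach: fraction of claims with any evidence attached."""
    return sum(c.evidence is not None for c in claims) / len(claims)

def warrant_score(claims: list[Claim]) -> float:
    """Hypothetical WarrantScore: mean inferential strength over
    claim-evidence pairs, counting unevidenced claims as zero support."""
    return sum(entailment_score(c.text, c.evidence) if c.evidence else 0.0
               for c in claims) / len(claims)

claims = [
    Claim("the ablation results are weak",
          "table 3 shows no ablation on the encoder"),
    Claim("the paper is hard to follow", None),
]
print(round(substantiation_level(claims), 2))  # presence-only view
print(round(warrant_score(claims), 2))         # also weighs inference strength
```

Under this sketch, both reviews in which evidence is present but irrelevant and reviews with no evidence at all are penalized, which is the distinction the abstract draws between detecting evidence and assessing logical inference.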