PeerArg: Argumentative Peer Review with LLMs

📅 2024-09-25
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two critical challenges in scientific paper peer review: (1) the subjectivity and bias inherent in human reviews, and (2) the lack of interpretability and trustworthiness in existing NLP-based review support systems due to their opaque “black-box” nature. To tackle these issues, we propose PeerArg—a novel, explainable review support system that pioneers structured argumentation modeling for review synthesis and decision prediction. PeerArg integrates large language models (LLMs), domain-specific knowledge graphs, argument mining techniques, and rule-guided reasoning chains to produce interpretable, evidence-backed aggregations of multiple review reports and to predict acceptance decisions. Through few-shot adaptation, PeerArg significantly outperforms end-to-end few-shot LLM baselines across three benchmark datasets. Results demonstrate that explicit argumentative modeling not only improves predictive accuracy but also enhances decision transparency—thereby overcoming a key barrier to the trustworthy deployment of AI-assisted peer review systems.

📝 Abstract
Peer review is an essential process to determine the quality of papers submitted to scientific conferences or journals. However, it is subjective and prone to biases. Several studies have applied NLP techniques to support peer review, but they rely on black-box methods whose outputs are difficult to interpret and trust. In this paper, we propose a novel pipeline to support and understand the reviewing and decision-making processes of peer review: the PeerArg system, combining LLMs with methods from knowledge representation. PeerArg takes as input a set of reviews for a paper and outputs a paper acceptance prediction. We evaluate the performance of the PeerArg pipeline on three different datasets, in comparison with a novel end-to-end LLM that uses few-shot learning to predict paper acceptance given reviews. The results indicate that the end-to-end LLM is capable of predicting paper acceptance from reviews, but a variant of the PeerArg pipeline outperforms this LLM.
Problem

Research questions and friction points this paper is trying to address.

Improve peer review objectivity using LLMs
Enhance interpretability of review decision processes
Predict paper acceptance with advanced NLP techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs combined with knowledge representation
Predicts paper acceptance from reviews
A PeerArg variant outperforms the end-to-end few-shot LLM baseline
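The pipeline described above (reviews in, interpretable acceptance prediction out) can be sketched as follows. This is an illustrative stand-in, not the authors' implementation: a keyword heuristic plays the role of the LLM-based argument miner, and a simple pro/con tally stands in for the argumentation-based aggregation; all names and marker words are hypothetical.

```python
from dataclasses import dataclass

# Illustrative stand-ins for PeerArg-style components. The real system
# uses LLMs and argument mining; here keyword matching tags sentences.
PRO_MARKERS = {"novel", "strong", "clear", "thorough", "convincing"}
CON_MARKERS = {"unclear", "weak", "limited", "missing", "incremental"}

@dataclass
class Argument:
    text: str
    stance: str  # "pro" or "con"

def mine_arguments(review: str) -> list[Argument]:
    """Toy argument miner: tag each sentence pro or con by keyword."""
    args = []
    for sentence in review.split("."):
        words = set(sentence.lower().split())
        if words & PRO_MARKERS:
            args.append(Argument(sentence.strip(), "pro"))
        elif words & CON_MARKERS:
            args.append(Argument(sentence.strip(), "con"))
    return args

def predict_acceptance(reviews: list[str]) -> tuple[str, list[Argument]]:
    """Aggregate mined arguments across all reviews and predict a
    decision, returning the arguments as an interpretable trace."""
    args = [a for r in reviews for a in mine_arguments(r)]
    pros = sum(a.stance == "pro" for a in args)
    cons = sum(a.stance == "con" for a in args)
    decision = "accept" if pros > cons else "reject"
    return decision, args

reviews = [
    "The method is novel and the experiments are thorough.",
    "Writing is unclear in places. Still, the results are convincing.",
]
decision, args = predict_acceptance(reviews)
print(decision)  # accept (2 pro arguments vs 1 con)
```

Unlike an end-to-end classifier, the returned argument list makes the basis of each decision inspectable, which is the transparency property the paper argues for.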