ReviewRL: Towards Automated Scientific Review with RL

πŸ“… 2025-08-13
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing automated peer review systems suffer from significant limitations in factual accuracy, scoring consistency, and analytical depth, yielding generic, insight-poor feedback. To address these issues, we propose the first reinforcement learning framework integrating literature retrieval augmentation (ArXiv-MCP) with a multi-dimensional composite reward mechanism, jointly optimizing review quality and scoring accuracy. Our approach combines retrieval of external scientific literature with supervised fine-tuning on high-quality human reviews, followed by reinforcement learning. Evaluated on the ICLR 2025 dataset, our model substantially outperforms baselines: generated reviews exhibit stronger factual grounding and critical reasoning, while scores better align with expert consensus. Both human evaluation and automated metrics confirm marked improvements in depth, coherence, and trustworthiness. This work establishes a novel paradigm for developing reliable, interpretable, and scientifically grounded intelligent peer review systems.

πŸ“ Abstract
Peer review is essential for scientific progress but faces growing challenges due to increasing submission volumes and reviewer fatigue. Existing automated review approaches struggle with factual accuracy, rating consistency, and analytical depth, often generating superficial or generic feedback lacking the insights characteristic of high-quality human reviews. We introduce ReviewRL, a reinforcement learning framework for generating comprehensive and factually grounded scientific paper reviews. Our approach combines: (1) an ArXiv-MCP retrieval-augmented context generation pipeline that incorporates relevant scientific literature, (2) supervised fine-tuning that establishes foundational reviewing capabilities, and (3) a reinforcement learning procedure with a composite reward function that jointly enhances review quality and rating accuracy. Experiments on ICLR 2025 papers demonstrate that ReviewRL significantly outperforms existing methods across both rule-based metrics and model-based quality assessments. ReviewRL establishes a foundational framework for RL-driven automatic critique generation in scientific discovery, demonstrating promising potential for future development in this domain. The implementation of ReviewRL will be released at GitHub.
Problem

Research questions and friction points this paper is trying to address.

Addressing reviewer fatigue and high submission volumes in peer review
Improving factual accuracy and depth in automated review feedback
Enhancing rating consistency and analytical quality in scientific critiques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-augmented context generation pipeline
Supervised fine-tuning for reviewing capabilities
Reinforcement learning with composite reward function
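The composite reward described above jointly scores review quality and rating accuracy. A minimal sketch of such a reward is shown below; the weights, function name, and the inverse-error form of the rating-accuracy term are hypothetical illustrations, not the paper's actual formulation.

```python
# Hedged sketch of a composite reward that combines a review-quality
# term with a rating-accuracy term, as the abstract describes.
# All names and weights here are illustrative assumptions.

def composite_reward(quality_score: float,
                     predicted_rating: float,
                     reference_rating: float,
                     w_quality: float = 0.5,
                     w_rating: float = 0.5) -> float:
    """Weighted sum of a review-quality term (assumed in [0, 1]) and a
    rating-accuracy term that decays with the absolute error between
    the model's predicted rating and the reference (human) rating."""
    rating_accuracy = 1.0 / (1.0 + abs(predicted_rating - reference_rating))
    return w_quality * quality_score + w_rating * rating_accuracy
```

In this sketch a perfect review with an exactly matching rating earns reward 1.0, and either a weaker review or a larger rating error smoothly lowers the reward, which is the joint-optimization behavior the abstract attributes to the composite reward.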
Sihang Zeng
University of Washington
Biomedical Informatics, Machine Learning for Healthcare
Kai Tian
Tsinghua University
Kaiyan Zhang
Tsinghua University
Foundation Models, Collective Intelligence, Scientific Intelligence
Yuru Wang
Tsinghua University
Junqi Gao
Shanghai AI Lab, Harbin Institute of Technology
Deep Learning, Generative Models, Continual Learning
Runze Liu
Tsinghua University, Shanghai AI Laboratory
Sa Yang
Peking University
Jingxuan Li
Harbin Engineering University
Xinwei Long
Tsinghua University
Natural Language Processing, Multi-modal Learning
Jiaheng Ma
Beijing Institute of Technology
Biqing Qi
Shanghai AI Laboratory
Bowen Zhou
Tsinghua University, Shanghai AI Laboratory