Multimodal Peer Review Simulation with Actionable To-Do Recommendations for Community-Aware Manuscript Revisions

📅 2025-11-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current peer review systems—constrained by unimodal (text-only) input, limited contextual grounding, and non-actionable feedback—fail to harness the full potential of large language models (LLMs). This paper introduces the first multimodal (text + figure) peer review simulation system augmented with community knowledge. Leveraging OpenReview data, we design a retrieval-augmented generation (RAG) framework that jointly models visual content and academic semantics to generate high-quality, context-aware review comments. We propose a novel structured format—Action:Objective[#]—to transform feedback into executable, traceable revision tasks. The system is deployed via a web-based interactive interface integrated with academic writing platforms, enabling real-time pre-submission feedback and revision tracking. Experiments demonstrate significant improvements over ablated baselines across comprehensiveness, practicality, and expert agreement, thereby enhancing both review quality and collaborative efficiency.

📝 Abstract
While large language models (LLMs) offer promising capabilities for automating academic workflows, existing systems for academic peer review remain constrained by text-only inputs, limited contextual grounding, and a lack of actionable feedback. In this work, we present an interactive web-based system for multimodal, community-aware peer review simulation to enable effective manuscript revisions before paper submission. Our framework integrates textual and visual information through multimodal LLMs, enhances review quality via retrieval-augmented generation (RAG) grounded in web-scale OpenReview data, and converts generated reviews into actionable to-do lists using the proposed Action:Objective[#] format, providing structured and traceable guidance. The system integrates seamlessly into existing academic writing platforms, providing interactive interfaces for real-time feedback and revision tracking. Experimental results highlight the effectiveness of the proposed system in generating more comprehensive and useful reviews aligned with expert standards, surpassing ablated baselines and advancing transparent, human-centered scholarly assistance.
Problem

Research questions and friction points this paper is trying to address.

Existing peer review systems lack multimodal inputs and contextual grounding
Current academic feedback fails to provide structured actionable recommendations
Limited integration of visual and textual data for manuscript revision guidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal LLMs integrate text and visual data
RAG enhances reviews using OpenReview data
Action:Objective format creates actionable to-do lists
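The paper does not spell out the exact syntax of the `Action:Objective[#]` format here, but its stated intent (executable, traceable revision tasks, where `[#]` links a task back to the review comment it came from) can be sketched as a minimal parser. The line pattern below is an assumption for illustration, not the authors' specification.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class TodoItem:
    action: str     # imperative verb, e.g. "Revise"
    objective: str  # what to change in the manuscript
    ref: int        # index of the review comment this task traces back to

# Hypothetical pattern for one "Action:Objective[#]" line; the actual
# syntax used by the system may differ.
PATTERN = re.compile(r"^(?P<action>[^:]+):(?P<objective>.+?)\[(?P<ref>\d+)\]$")

def parse_todo(line: str) -> Optional[TodoItem]:
    """Parse one review-derived line into a structured, traceable task."""
    m = PATTERN.match(line.strip())
    if m is None:
        return None
    return TodoItem(m["action"].strip(), m["objective"].strip(), int(m["ref"]))

# Example: a generated review comment #2 turned into a revision task.
item = parse_todo("Revise: Report variance across random seeds [2]")
```

The back-reference index is what makes the to-do list traceable: each task can be rendered next to the review comment that motivated it, supporting the revision-tracking interface the abstract describes.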