🤖 AI Summary
This study addresses the lack of large-scale empirical evidence on the machine learning community's attitudes toward open peer review (specifically, the public release of review reports and open commentary) and its perceived risks and benefits.
Method: We surveyed 2,385 ICLR and NeurIPS reviewers, authors, and area chairs; performed qualitative thematic analysis; and applied AI-driven text annotation to quantitatively compare review quality under the two conferences' fully open versus partially open review policies.
Contribution/Results: Over 80% of respondents support publishing reviews for accepted papers, and 41% identified resubmission bias as the primary risk. AI-based evaluation showed that reviews at the fully open venue score statistically significantly higher in correctness and completeness than those at the partially open venue, though the effect sizes are small. This work pioneers the integration of large-scale empirical survey data with AI-assisted, quantitative assessment of review quality, providing actionable evidence and concrete implementation pathways for open-science policy development.
📝 Abstract
Peer-review venues have increasingly adopted open reviewing policies that publicly release anonymized reviews and permit public commenting. Policies vary across venues, and debate continues about the benefits and drawbacks of these choices. To inform this debate, we surveyed 2,385 reviewers, authors, and other peer-review participants in machine learning to understand their experiences and opinions. Our key findings are:
(a) Preferences: Over 80% of respondents support releasing reviews for accepted papers and allowing public comments. However, only 27.1% support releasing rejected manuscripts.
(b) Benefits: Respondents cite improved public understanding (75.3%) and reviewer education (57.8%), increased fairness (56.6%), and stronger incentives for high-quality reviews (48.0%).
(c) Challenges: The top concern is resubmission bias, in which a paper's rejection history biases future reviewers (ranked as the top impact of open reviewing by 41% of respondents, and mentioned in over 50% of free responses). Other challenges include fear of reviewer de-anonymization (33.2%) and potential abuse of public commenting.
(d) AI and open peer review: Participants believe open policies deter "AI slop" submissions (71.9%) and AI-generated reviews (38.9%). Respondents are split regarding peer-review venues generating official AI reviews, with 56.0% opposed and 44.0% supportive.
Finally, we use AI to annotate 4,244 reviews from ICLR (fully open) and NeurIPS (partially open). We find that the fully open venue (ICLR) has higher levels of correctness and completeness than the partially open venue (NeurIPS); the effect size is small for correctness and very small for completeness, and both differences are statistically significant. We find no statistically significant difference in the level of substantiation. We release the full dataset at https://github.com/justinpayan/OpenReviewAnalysis.
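The "small" and "very small" effect sizes above refer to standardized mean differences. A minimal sketch of how such a comparison can be computed with Cohen's d is below; the scores and the `cohens_d` helper are illustrative assumptions, not the paper's actual data or pipeline:

```python
import math

def cohens_d(a, b):
    """Standardized mean difference between two samples, using the pooled SD."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

# Hypothetical AI-annotated correctness scores (1-5 scale) for two venues.
fully_open_scores = [4, 5, 4, 3, 4, 5, 4, 4]
partially_open_scores = [3, 4, 4, 3, 4, 3, 4, 3]

d = cohens_d(fully_open_scores, partially_open_scores)
# Conventionally, |d| near 0.2 is read as "small" and near 0.5 as "medium";
# a significance test (e.g. a t-test) would accompany d in practice.
```

Reporting an effect size alongside the p-value, as the abstract does, distinguishes "the difference is real" from "the difference is large", which matters here because both significant differences are small in magnitude.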