Is Peer Review Really in Decline? Analyzing Review Quality across Venues and Time

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses widespread concerns in the academic community about declining peer review quality by proposing the first evaluation framework that enables cross-conference and longitudinal comparisons of review quality. The framework characterizes the diversity of review formats, applies standardized preprocessing pipelines, and combines large language model–assisted scoring with lightweight heuristic metrics to build a multidimensional quantitative assessment system. Empirical analysis of major AI/ML conferences, including ICLR, NeurIPS, and *ACL, shows that median review quality exhibits no consistent downward trend, challenging the prevailing narrative of "review quality decay." The study also establishes a reproducible methodological foundation for future research on peer review mechanisms.

📝 Abstract
Peer review is at the heart of modern science. As submission numbers rise and research communities grow, the decline in review quality is a popular narrative and a common concern. Yet, is it true? Review quality is difficult to measure, and the ongoing evolution of reviewing practices makes it hard to compare reviews across venues and time. To address this, we introduce a new framework for evidence-based comparative study of review quality and apply it to major AI and machine learning conferences: ICLR, NeurIPS and *ACL. We document the diversity of review formats and introduce a new approach to review standardization. We propose a multi-dimensional schema for quantifying review quality as utility to editors and authors, coupled with both LLM-based and lightweight measurements. We study the relationships between measurements of review quality, and its evolution over time. Contradicting the popular narrative, our cross-temporal analysis reveals no consistent decline in median review quality across venues and years. We propose alternative explanations, and outline recommendations to facilitate future empirical studies of review quality.
Problem

Research questions and friction points this paper addresses:

peer review, review quality, scientific publishing, AI conferences, temporal analysis
Innovation

Methods, ideas, or system contributions that make the work stand out:

review quality, peer review, standardization, large language models, empirical analysis
Ilia Kuznetsov
UKP Lab, TU Darmstadt
natural language processing, scholarly AI, peer review, intertextuality, interpretability
Rohan Nayak
Ubiquitous Knowledge Processing Lab (UKP Lab), Department of Computer Science, Technical University of Darmstadt and National Research Center for Applied Cybersecurity ATHENE
Alla Rozovskaya
Department of Computer Science at Queens College, City University of New York (CUNY)
Iryna Gurevych
Full Professor, TU Darmstadt; Adjunct Professor, MBZUAI, UAE; Affiliated Professor, INSAIT, Bulgaria
Natural Language Processing, Large Language Models, Artificial Intelligence