🤖 AI Summary
This study addresses score bias and low inter-rater reliability in machine learning conference peer review. It reports a first-of-its-kind large-scale empirical study, conducted at ICML 2023, in which authors with multiple submissions ranked their own papers by perceived quality. The paper focuses on the Isotonic Mechanism, a calibration method grounded in isotonic regression that maps raw reviewer scores to calibrated scores consistent with the author's ordinal ranking, favoring interpretability and minimal intervention risk. Experiments show that the calibrated scores outperform raw scores in estimating ground-truth "expected review scores," with substantial reductions in both MSE and MAE, and an ordinal-consistency analysis supports the reliability of the author-provided rankings. The paper also proposes cautious, low-risk applications of the mechanism, such as supporting senior area chairs in overseeing area chairs' recommendations, informing paper-award selection, and guiding emergency reviewer recruitment, outlining a practical path toward higher-quality peer review.
📝 Abstract
We conducted an experiment during the review process of the 2023 International Conference on Machine Learning (ICML), asking authors with multiple submissions to rank their papers based on perceived quality. In total, we received 1,342 rankings, each from a different author, covering 2,592 submissions. In this paper, we present an empirical analysis of how author-provided rankings could be leveraged to improve peer review processes at machine learning conferences. We focus on the Isotonic Mechanism, which calibrates raw review scores using the author-provided rankings. Our analysis shows that these ranking-calibrated scores outperform the raw review scores in estimating the ground truth "expected review scores" in terms of both squared and absolute error metrics. Furthermore, we propose several cautious, low-risk applications of the Isotonic Mechanism and author-provided rankings in peer review, including supporting senior area chairs in overseeing area chairs' recommendations, assisting in the selection of paper awards, and guiding the recruitment of emergency reviewers.
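At its core, the Isotonic Mechanism computes the least-squares projection of the raw scores onto the set of score vectors that are non-increasing along the author's ranking, which can be done with the pool-adjacent-violators algorithm (PAVA). The sketch below illustrates this idea for a single author; the function name and unweighted, single-score-per-paper setup are simplifying assumptions for illustration, not the paper's exact deployment.

```python
def isotonic_calibrate(raw_scores, ranking):
    """Calibrate raw review scores to agree with an author's ranking.

    raw_scores: one aggregate score per paper.
    ranking: paper indices ordered from the author's best to worst.

    Returns the least-squares projection of raw_scores onto score
    vectors non-increasing along the ranking (isotonic regression),
    computed with the pool-adjacent-violators algorithm.
    """
    ordered = [float(raw_scores[i]) for i in ranking]

    # PAVA: keep blocks as [sum, count]; merge adjacent blocks whenever
    # an earlier block's mean is smaller than a later one's (a violation
    # of the required non-increasing order).
    blocks = []
    for y in ordered:
        blocks.append([y, 1])
        while (len(blocks) > 1 and
               blocks[-2][0] * blocks[-1][1] < blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c

    # Expand each block to its mean, then scatter back to paper order.
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    calibrated = [0.0] * len(raw_scores)
    for pos, idx in enumerate(ranking):
        calibrated[idx] = fitted[pos]
    return calibrated
```

For example, if the author ranks paper 0 above paper 1 but the raw scores are 5.0 and 6.0, the violating pair is averaged to 5.5 each; scores already consistent with the ranking are left unchanged.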