🤖 AI Summary
In response to the exponential growth of academic publications, identifying high-quality research has become an increasingly pressing challenge. This paper proposes PaperEval, a large language model (LLM)-based evaluation framework that integrates domain-aware retrieval with latent reasoning. Methodologically, the authors design a domain-adaptive literature retrieval module to mitigate knowledge obsolescence; develop a context-aware, progressive reasoning mechanism that models research motivation, methodological novelty, and cross-cutting contributions in depth; and introduce a multi-stage ranking optimization strategy to improve discriminative accuracy. The framework achieves significant improvements over state-of-the-art methods on two benchmark datasets. Deployed in a real-world paper recommendation system, it serves over 8,000 subscribers, with individual filtered papers attaining more than 10,000 views, demonstrating practical efficacy and scalability in production.
📝 Abstract
With the rapid and continuous growth of academic publications, identifying high-quality research has become an increasingly pressing challenge. While recent methods leveraging Large Language Models (LLMs) for automated paper evaluation have shown great promise, they are often constrained by outdated domain knowledge and limited reasoning capabilities. In this work, we present PaperEval, a novel LLM-based framework for automated paper evaluation that addresses these limitations through two key components: 1) a domain-aware paper retrieval module that retrieves relevant concurrent work to support contextualized assessments of novelty and contribution, and 2) a latent reasoning mechanism that enables deep understanding of complex motivations and methodologies, together with comprehensive comparison against concurrent related work, to support more accurate and reliable evaluation. To guide the reasoning process, we introduce a progressive ranking optimization strategy that encourages the LLM to iteratively refine its predictions with an emphasis on relative comparison. Experiments on two datasets demonstrate that PaperEval consistently outperforms existing methods in both academic-impact and paper-quality evaluation. In addition, we deploy PaperEval in a real-world paper recommendation system for filtering high-quality papers, which has gained strong engagement on social media, amassing over 8,000 subscribers and attracting over 10,000 views for many of the filtered papers, demonstrating the practical effectiveness of PaperEval.
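The abstract's "progressive ranking optimization with an emphasis on relative comparison" can be illustrated with a minimal sketch. Note this is an assumption-based reading, not the paper's published algorithm: `llm_prefers` is a hypothetical stand-in for an actual LLM judge call (here faked with a length heuristic so the snippet runs), and the iterative refinement is modeled as repeated adjacent pairwise comparisons.

```python
# Hedged sketch of progressive relative ranking. The judge function below is
# a placeholder: a real system would prompt an LLM to compare two papers.

def llm_prefers(paper_a: str, paper_b: str) -> bool:
    """Hypothetical judge: True if paper_a should rank above paper_b.
    Faked here with a simple length heuristic for demonstration only."""
    return len(paper_a) > len(paper_b)

def progressive_rank(papers: list[str], max_passes: int = 3) -> list[str]:
    """Iteratively refine an ordering over several passes of adjacent
    pairwise comparisons, stopping early once the ranking stabilizes."""
    ranked = list(papers)
    for _ in range(max_passes):
        swapped = False
        for i in range(len(ranked) - 1):
            # Relative comparison: promote the lower-ranked paper if the
            # judge prefers it over its neighbor.
            if llm_prefers(ranked[i + 1], ranked[i]):
                ranked[i], ranked[i + 1] = ranked[i + 1], ranked[i]
                swapped = True
        if not swapped:
            break  # ranking converged
    return ranked

print(progressive_rank(["short", "a much longer abstract", "medium one"]))
# → ['a much longer abstract', 'medium one', 'short']
```

Each pass only asks for relative judgments between neighboring candidates, which is one plausible way to realize comparison-driven iterative refinement without ever requesting absolute quality scores.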