🤖 AI Summary
To address the time-consuming, subjective, and non-scalable nature of manual literature reviews in assessing scientific idea novelty, this paper proposes a retrieval-augmented generation (RAG) framework. Methodologically, it employs a two-stage retrieval process coupled with a facet-aware large language model (LLM) re-ranking mechanism, integrating keyword/phrase matching, embedding-based filtering, and literature-anchored generative reasoning; expert-annotated examples further enhance interpretability. The core contribution lies in incorporating structured bibliographic facets—such as methodology, problem domain, and technical approach—into re-ranking, enabling fine-grained, traceable novelty assessment. Experiments demonstrate approximately 13% higher agreement than baseline methods on novelty classification. Ablation studies confirm the critical role of the facet-aware re-ranking module, which significantly boosts identification of highly relevant prior work and improves system robustness.
📝 Abstract
Automated scientific idea generation systems have made remarkable progress, yet the automatic evaluation of idea novelty remains a critical and underexplored challenge. Manual evaluation of novelty through literature review is labor-intensive, prone to error due to subjectivity, and impractical at scale. To address these issues, we propose the Idea Novelty Checker, an LLM-based retrieval-augmented generation (RAG) framework that leverages a two-stage retrieve-then-rerank approach. The Idea Novelty Checker first collects a broad set of relevant papers using keyword and snippet-based retrieval, then refines this collection through embedding-based filtering followed by facet-based LLM re-ranking. It incorporates expert-labeled examples to guide the system in comparing papers for novelty evaluation and in generating literature-grounded reasoning. Our extensive experiments demonstrate that our novelty checker achieves approximately 13% higher agreement than existing approaches. Ablation studies further showcase the importance of the facet-based re-ranker in identifying the most relevant literature for novelty evaluation.
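The retrieve-then-rerank pipeline described above can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the corpus, the bag-of-words "embedding", and the facet-overlap scoring (standing in for the paper's LLM-based facet re-ranking) are all assumptions introduced here for clarity.

```python
import math

# Toy corpus; papers, snippets, and facet labels are invented for illustration.
PAPERS = [
    {"id": "P1", "text": "graph neural network for molecule property prediction",
     "facets": {"method": "GNN", "problem": "molecule property prediction"}},
    {"id": "P2", "text": "transformer language model for code generation",
     "facets": {"method": "transformer", "problem": "code generation"}},
    {"id": "P3", "text": "graph transformer for molecule generation",
     "facets": {"method": "graph transformer", "problem": "molecule generation"}},
]

def embed(text):
    # Stand-in for a real embedding model: bag-of-words over a tiny vocabulary.
    vocab = ["graph", "neural", "transformer", "molecule", "code",
             "generation", "prediction"]
    return [text.split().count(w) for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def keyword_retrieve(idea, papers):
    # Stage 1: broad recall via keyword/snippet overlap.
    terms = set(idea.split())
    return [p for p in papers if terms & set(p["text"].split())]

def embedding_filter(idea, papers, threshold=0.3):
    # Stage 2a: narrow the candidate set by embedding similarity.
    q = embed(idea)
    return [p for p in papers if cosine(q, embed(p["text"])) >= threshold]

def facet_rerank(idea_facets, papers):
    # Stage 2b: facet-aware re-ranking. The real system asks an LLM to compare
    # facets (methodology, problem, approach); here we just count facet overlaps.
    def score(p):
        return sum(1 for k, v in idea_facets.items()
                   if v in p["facets"].get(k, "") or p["facets"].get(k, "") in v)
    return sorted(papers, key=score, reverse=True)

idea = "graph transformer for molecule generation"
idea_facets = {"method": "graph transformer", "problem": "molecule generation"}

candidates = keyword_retrieve(idea, PAPERS)
candidates = embedding_filter(idea, candidates)
ranked = facet_rerank(idea_facets, candidates)
print([p["id"] for p in ranked])  # most facet-relevant prior work first
```

In the full system the top-ranked papers would then ground an LLM's novelty verdict and its literature-anchored reasoning, with expert-labeled examples supplied as in-context demonstrations.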