🤖 AI Summary
To help researchers find novel ideas and cross-disciplinary insights in a rapidly growing literature, this paper introduces SciMuse, a system that integrates a knowledge graph constructed from 58 million scholarly papers with large language models (LLMs) to generate personalized research ideas. Methodologically, it proposes a hybrid interest prediction framework that combines a supervised neural ranking model with zero-shot LLM-based ranking, augmented by knowledge graph embeddings and prompt engineering. Key contributions include: (1) releasing the first publicly available benchmark dataset for scientific interest prediction; (2) conducting a human evaluation in which over 100 research group leaders ranked more than 4,400 generated ideas, demonstrating a statistically significant improvement in human interest scores (p < 0.001); and (3) empirically validating that cross-disciplinary ideas achieve higher acceptance rates. This work advances AI-powered research assistance from generic content generation toward trustworthy, user-aligned idea recommendation.
📝 Abstract
The rapid growth of scientific literature makes it challenging for researchers to identify novel and impactful ideas, especially across disciplines. Modern artificial intelligence (AI) systems offer new approaches, potentially inspiring ideas not conceived by humans alone. But how compelling are these AI-generated ideas, and how can we improve their quality? Here, we introduce SciMuse, which uses 58 million research papers and a large language model to generate research ideas. We conduct a large-scale evaluation in which over 100 research group leaders -- from natural sciences to humanities -- ranked more than 4,400 personalized ideas based on their interest. This data allows us to predict research interest using (1) supervised neural networks trained on human evaluations, and (2) unsupervised zero-shot ranking with large language models. Our results demonstrate how future systems can help generate compelling research ideas and foster unforeseen interdisciplinary collaborations.