🤖 AI Summary
This study examines whether current AI research agents broaden scientific exploration or mainly stay close to existing work. Starting from shared seed literature, the authors use four AI research-agent frameworks and six large language models to generate 37,802 research ideas across citation-defined research areas in AI and machine learning. Using textual similarity analyses, they compare these ideas with human-authored papers from the same areas, with later human follow-on work that builds on the same seed literature, and with the seed literature itself. AI-generated ideas are substantially more concentrated than human-authored papers and stay much closer to the seed literature than human follow-on work does. Where they differ from prior work, the difference comes mainly from recombining existing technical methods rather than from posing new research questions. Papers most similar to AI-generated ideas also tend to receive fewer subsequent citations. The authors conclude that current AI research agents appear better suited to local elaboration than to broadening scientific exploration.
📝 Abstract
AI research agents can now generate research ideas, design experiments, run code, and draft papers, raising the possibility of large-scale AI-assisted scientific discovery. Many current agent frameworks explicitly encourage the generation of novel and high-impact ideas. Yet it remains unclear whether AI-assisted ideation broadens scientific exploration or mainly concentrates around existing work. We study AI research agents as scientific search systems. Using four AI research-agent frameworks and six large language models, we generate 37,802 scientific ideas from shared seed literature across citation-defined research areas in AI and machine learning. We then compare the resulting AI ideas against human-authored papers from the same research areas, follow-on human research emerging from the same seed literature, and the seed literature itself. Across experiments, four consistent patterns emerge. First, AI-generated ideas are substantially more concentrated than human-authored papers from the same research areas. Second, AI-generated ideas remain much closer to their starting literature than later human follow-on work does. Third, papers most similar to AI-generated ideas tend to receive lower subsequent citations. Fourth, when AI-generated ideas differ from prior work, the differences arise primarily from recombining existing technical methods rather than introducing fundamentally new research questions. Overall, current AI research agents appear better suited to local elaboration than to broadening scientific exploration.
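The abstract does not state how concentration or closeness to the seed literature is measured. As a minimal sketch, assuming each idea or paper has already been turned into an embedding vector, one plausible reading is mean pairwise cosine similarity within a set (higher means more concentrated) and mean cosine similarity to a seed vector (higher means closer to the starting literature). The function names and the toy 2-D vectors below are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(vectors):
    """Assumed concentration score: average similarity over all pairs in a set."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def mean_similarity_to_seed(vectors, seed):
    """Assumed closeness score: average similarity of each item to the seed."""
    return sum(cosine(v, seed) for v in vectors) / len(vectors)


# Toy, made-up embeddings (illustrative only, not data from the study).
seed = [1.0, 0.0]
ai_ideas = [[1.0, 0.1], [1.0, -0.1], [0.9, 0.0]]
human_papers = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

ai_concentration = mean_pairwise_similarity(ai_ideas)
human_concentration = mean_pairwise_similarity(human_papers)
ai_seed_closeness = mean_similarity_to_seed(ai_ideas, seed)
human_seed_closeness = mean_similarity_to_seed(human_papers, seed)
```

In this toy setup the tightly clustered set scores higher on both measures than the spread-out set, which is the kind of contrast the first two findings describe. A real analysis would need an actual embedding model and a comparison matched by research area.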