DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively

📅 2025-09-30
🤖 AI Summary
Problem: Existing AI Scientist systems lack goal-directedness, which limits their ability to produce high-impact discoveries on pressing, human-defined scientific challenges. Method: The paper proposes DeepScientist, a goal-driven, fully autonomous scientific discovery system that formalizes discovery as a Bayesian Optimization problem and runs a closed loop of hypothesis generation, experimental verification, and result analysis. The loop combines LLM-driven hypothesis generation with automated experiment execution, uses a cumulative Findings Memory to balance exploration against exploitation, and promotes only the most promising findings to higher-fidelity validation through hierarchical evaluation. Contribution/Results: Consuming over 20,000 GPU hours, the system generated about 5,000 scientific ideas and experimentally validated approximately 1,100 of them. On three frontier AI tasks, it surpassed human-designed state-of-the-art methods by 183.7%, 1.9%, and 7.9%, respectively, which the authors present as the first large-scale evidence of an AI progressively surpassing human SOTA on scientific tasks.

📝 Abstract
While previous AI Scientist systems can generate novel findings, they often lack the focus to produce scientifically valuable contributions that address pressing human-defined challenges. We introduce DeepScientist, a system designed to overcome this by conducting goal-oriented, fully autonomous scientific discovery over month-long timelines. It formalizes discovery as a Bayesian Optimization problem, operationalized through a hierarchical evaluation process consisting of "hypothesize, verify, and analyze". Leveraging a cumulative Findings Memory, this loop intelligently balances the exploration of novel hypotheses with exploitation, selectively promoting the most promising findings to higher-fidelity levels of validation. Consuming over 20,000 GPU hours, the system generated about 5,000 unique scientific ideas and experimentally validated approximately 1,100 of them, ultimately surpassing human-designed state-of-the-art (SOTA) methods on three frontier AI tasks by 183.7%, 1.9%, and 7.9%. This work provides the first large-scale evidence of an AI achieving discoveries that progressively surpass human SOTA on scientific tasks, producing valuable findings that genuinely push the frontier of scientific discovery. To facilitate further research into this process, we will open-source all experimental logs and system code at https://github.com/ResearAI/DeepScientist/.
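
The hierarchical "hypothesize, verify, and analyze" process described in the abstract can be pictured as a promotion funnel. The Python below is a minimal sketch, not the authors' implementation: the `Finding` class, the `promote` and `run_cycle` helpers, the stage names, the per-tier budgets `k_verify` and `k_analyze`, and the toy scores are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Finding:
    idea: str
    estimate: float                 # cheap pre-experiment score (hypothetical)
    stage: str = "hypothesis"       # hypothesis -> verified -> analyzed
    result: Optional[float] = None  # filled in by the verification experiment


def promote(findings: List[Finding], src: str, dst: str, k: int,
            key: Callable[[Finding], float]) -> List[Finding]:
    """Advance the k highest-scoring findings from stage src to stage dst."""
    pool = [f for f in findings if f.stage == src]
    chosen = sorted(pool, key=key, reverse=True)[:k]
    for f in chosen:
        f.stage = dst
    return chosen


def run_cycle(findings: List[Finding], experiment: Callable[[str], float],
              k_verify: int, k_analyze: int) -> List[Finding]:
    # Tier 1 -> 2: only the most promising ideas receive an expensive experiment.
    for f in promote(findings, "hypothesis", "verified", k_verify,
                     lambda f: f.estimate):
        f.result = experiment(f.idea)
    # Tier 2 -> 3: only the best measured results receive deeper analysis.
    return promote(findings, "verified", "analyzed", k_analyze,
                   lambda f: f.result)


# Toy data: estimates and experiment outcomes are made up for the example.
memory = [Finding("idea-a", 0.9), Finding("idea-b", 0.2),
          Finding("idea-c", 0.7), Finding("idea-d", 0.5)]
toy_results = {"idea-a": 0.55, "idea-c": 0.80, "idea-d": 0.10}
analyzed = run_cycle(memory, toy_results.__getitem__, k_verify=2, k_analyze=1)
print([f.idea for f in analyzed])
```

With these toy numbers, `idea-a` and `idea-c` are verified, and only `idea-c` is promoted to the analysis tier. The point of the design is that expensive validation is spent on a shrinking subset of ideas, which matches the abstract's ratio of roughly 5,000 ideas to roughly 1,100 validated ones only in spirit, not in its actual selection rule.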

Problem

Research questions and friction points this paper is trying to address.

Developing goal-oriented autonomous AI for scientific discovery
Keeping AI Scientist systems focused on scientifically valuable contributions to human-defined challenges
Balancing exploration of novel hypotheses with exploitation of promising findings through Bayesian Optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Formalizes scientific discovery as a Bayesian Optimization problem
Implements a hierarchical "hypothesize, verify, and analyze" evaluation process
Leverages a cumulative Findings Memory to balance exploration and exploitation
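
One way to picture how a memory of past results can trade exploration against exploitation is an upper-confidence-bound (UCB) rule, a common acquisition function in Bayesian Optimization. The source does not state which surrogate model or acquisition function DeepScientist uses, so the sketch below is an assumption throughout: it swaps in plain sample statistics for a learned surrogate, and the names `acquisition`, `select_direction`, `kappa`, and the prior values are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Dict, List


def acquisition(scores: List[float], kappa: float = 1.0,
                prior_mean: float = 0.0, prior_std: float = 1.0) -> float:
    """UCB value for one research direction: estimated mean + kappa * uncertainty."""
    if len(scores) < 2:
        # Too little evidence for a spread estimate: fall back to the prior.
        mu = scores[0] if scores else prior_mean
        sigma = prior_std
    else:
        mu = mean(scores)
        sigma = stdev(scores) / math.sqrt(len(scores))  # standard error
    return mu + kappa * sigma


def select_direction(memory: Dict[str, List[float]], kappa: float = 1.0) -> str:
    """Pick the direction in memory with the highest acquisition value."""
    return max(memory, key=lambda d: acquisition(memory[d], kappa))


# Toy memory: measured scores per research direction (made-up numbers).
memory = {
    "direction_a": [0.60, 0.62, 0.61],  # well explored, consistently good
    "direction_b": [0.40],              # tried once
    "direction_c": [],                  # never tried
}
print(select_direction(memory, kappa=0.0))  # pure exploitation
print(select_direction(memory, kappa=1.0))  # uncertainty bonus enabled
```

With `kappa=0.0` the rule exploits the best-known direction (`direction_a`); with `kappa=1.0` the uncertainty bonus makes the barely tested `direction_b` the next choice. Each new result appended to the memory shrinks that bonus, which is the sense in which a cumulative record shifts the balance over time.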

Yixuan Weng
Westlake University

Minjun Zhu
Westlake University; CASIA
Natural Language Processing

Qiujie Xie
Engineering School, Westlake University

Qiyao Sun
Queen Mary University of London
AI Scientist

Zhen Lin
Engineering School, Westlake University

Sifan Liu
Duke University

Yue Zhang
Engineering School, Westlake University