Accelerating Scientific Discovery with Autonomous Goal-evolving Agents

📅 2025-12-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In scientific discovery, manually specified objective functions often fail to capture complex trade-offs, constituting a key bottleneck. This paper introduces SAGA (Scientific Autonomous Goal-evolving Agent), the first framework featuring a hierarchical goal-evolution architecture that automatically analyzes, generates, and computationally transforms objective functions—thereby transcending the fixed-objective paradigm. SAGA integrates large language model–based agents, multi-objective optimization, differentiable and symbolic scoring function synthesis, and a closed-loop scientific solver to enable dynamic goal evolution aligned with scientific semantics. Extensive experiments across four domains—antibiotic discovery, inorganic materials design, functional DNA screening, and chemical process optimization—demonstrate that SAGA significantly improves candidate Pareto-frontier quality (+37.2% on average) and accelerates discovery by 2.8×. This work provides the first systematic empirical validation that automated objective-function synthesis critically enhances the performance of scientific AI agents.

Technology Category

Application Category

📝 Abstract
There has been unprecedented interest in developing agents that expand the boundary of scientific discovery, primarily by optimizing quantitative objective functions specified by scientists. However, for grand challenges in science , these objectives are only imperfect proxies. We argue that automating objective function design is a central, yet unmet requirement for scientific discovery agents. In this work, we introduce the Scientific Autonomous Goal-evolving Agent (SAGA) to amend this challenge. SAGA employs a bi-level architecture in which an outer loop of LLM agents analyzes optimization outcomes, proposes new objectives, and converts them into computable scoring functions, while an inner loop performs solution optimization under the current objectives. This bi-level design enables systematic exploration of the space of objectives and their trade-offs, rather than treating them as fixed inputs. We demonstrate the framework through a broad spectrum of applications, including antibiotic design, inorganic materials design, functional DNA sequence design, and chemical process design, showing that automating objective formulation can substantially improve the effectiveness of scientific discovery agents.
Problem

Research questions and friction points this paper is trying to address.

Automates objective function design for scientific discovery agents
Enables systematic exploration of objective spaces and trade-offs
Improves effectiveness across diverse applications like antibiotic design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automating objective function design for agents
Bi-level architecture with LLM agents proposing objectives
Systematic exploration of objective space and trade-offs
Yuanqi Du
Yuanqi Du
PhD Student, Cornell University
Probabilistic ModelsGeometric Deep LearningAI for ScienceSampling/Optimization/Search
Botao Yu
Botao Yu
PhD student, Ohio State University
AI for ScienceNLPAI Music
T
Tianyu Liu
Yale University, New Haven, CT, USA
Tony Shen
Tony Shen
SFU, Broad Institute of MIT and Harvard
machine learningdrug discoverybiomolecular design
J
Junwu Chen
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Jan G. Rittig
Jan G. Rittig
EPFL, Laboratory of Artificial Chemical Intelligence (LIAC)
Molecular Machine LearningOptimizationChemical EngineeringGraph LearningHybrid Modeling
K
Kunyang Sun
University of California Berkeley, Berkeley, CA, USA
Y
Yikun Zhang
Northeastern University, Boston, MA, USA
Zhangde Song
Zhangde Song
Unknown affiliation
B
Bo Zhou
University of Illinois Chicago, Chicago, IL, USA
C
Cassandra Masschelein
École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Y
Yingze Wang
University of California Berkeley, Berkeley, CA, USA
Haorui Wang
Haorui Wang
PhD student, Gatech
Machine LearningLarge Language ModelsDecision MakingUncertainty Quantification
H
Haojun Jia
DeepPrinciple, Hangzhou, Zhejiang, China
C
Chao Zhang
Georgia Institute of Technology, Atlanta, GA, USA
Hongyu Zhao
Hongyu Zhao
Yale University
First interestSecond interest
M
Martin Ester
Simon Fraser University, Burnaby, BC, Canada
T
Teresa Head-Gordon
University of California Berkeley, Berkeley, CA, USA
C
Carla P. Gomes
Cornell University, Ithaca, NY, USA
Huan Sun
Huan Sun
Endowed CoE Innovation Scholar and Associate Professor, The Ohio State University
AgentsLarge Language ModelsNatural Language ProcessingAI
Chenru Duan
Chenru Duan
Deep Principle; MIT
computational chemistrymachine learningmolecular design
Philippe Schwaller
Philippe Schwaller
Assistant Professor, Laboratory of Artificial Chemical Intelligence - EPFL
Deep LearningML for ChemistryReaction PredictionSynthesis PlanningAccelerated Discovery
W
Wengong Jin
Northeastern University, Boston, MA, USA