From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the paradigm shift in which large language models (LLMs) evolve from research aids into autonomous scientific agents. Method: Through a systematic literature review and interdisciplinary analysis that integrates philosophy of science, human-AI collaboration theory, and AI governance, the authors propose a three-tiered autonomy framework for LLMs in scientific discovery (Tool → Analyst → Scientist). The framework delineates capability boundaries and evolutionary trajectories across the scientific workflow, including hypothesis generation, experimental design, and self-reflection, and anchors its conceptual architecture and technical roadmap in scientific methodology. Contribution/Results: The work delivers a comprehensive survey, establishes theoretical grounding for autonomous scientific agents, and releases Awesome-LLM-Scientific-Discovery, an open-source knowledge repository supporting both theoretical advancement and practical development.

📝 Abstract
Large Language Models (LLMs) are catalyzing a paradigm shift in scientific discovery, evolving from task-specific automation tools into increasingly autonomous agents and fundamentally redefining research processes and human-AI collaboration. This survey systematically charts this burgeoning field, placing a central focus on the changing roles and escalating capabilities of LLMs in science. Through the lens of the scientific method, we introduce a foundational three-level taxonomy (Tool, Analyst, and Scientist) to delineate their escalating autonomy and evolving responsibilities within the research lifecycle. We further identify pivotal challenges and future research trajectories such as robotic automation, self-improvement, and ethical governance. Overall, this survey provides a conceptual architecture and strategic foresight to navigate and shape the future of AI-driven scientific discovery, fostering both rapid innovation and responsible advancement. GitHub repository: https://github.com/HKUST-KnowComp/Awesome-LLM-Scientific-Discovery.
Problem

Research questions and friction points this paper is trying to address.

LLMs' evolving roles in scientific discovery
Autonomy levels of LLMs in research
Challenges in AI-driven scientific innovation
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs evolve from tools to autonomous science agents
Three-level taxonomy: Tool, Analyst, Scientist autonomy
Focus on robotic automation, self-improvement, ethics
Tianshi ZHENG
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, China
Zheye Deng
HKUST
Large Language Models, Text-to-Structure, Agent Reinforcement Learning
Hong Ting Tsang
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, China
Weiqi Wang
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, China
Jiaxin Bai
Hong Kong University of Science and Technology
Natural Language Processing
Zihao Wang
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, China
Yangqiu Song
HKUST
Artificial Intelligence, Data Mining, Natural Language Processing, Knowledge Graphs, Commonsense Reasoning