🤖 AI Summary
Scientific discovery faces persistent challenges, including data complexity, barriers to interdisciplinary collaboration, and insufficient reproducibility, that limit the efficacy of conventional AI agents in research contexts. Method: This paper establishes a domain-specific LLM-based agent paradigm for scientific tasks, proposing a novel tripartite architecture of domain-knowledge embedding, a closed-loop toolchain, and multi-layer verification, and formally characterizing the essential properties of scientific agents for the first time. It designs a multidimensional evaluation framework spanning hypothesis generation, experimental design, and data analysis, and integrates symbolic computation (SymPy), numerical libraries (NumPy), scholarly APIs, and simulation interfaces to enable interpretable, reproducible, and collaborative automated research workflows. Contribution/Results: Synthesizing over 100 state-of-the-art studies, the work identifies critical technical bottlenecks and ethical risks, and delivers the first high-fidelity, domain-adaptive roadmap for the systematic development of scientific agents.
📝 Abstract
As scientific research becomes increasingly complex, innovative tools are needed to manage vast data, facilitate interdisciplinary collaboration, and accelerate discovery. Large language models (LLMs) are now evolving into LLM-based scientific agents that automate critical tasks, ranging from hypothesis generation and experiment design to data analysis and simulation. Unlike general-purpose LLMs, these specialized agents integrate domain-specific knowledge, advanced tool sets, and robust validation mechanisms, enabling them to handle complex data types, ensure reproducibility, and drive scientific breakthroughs. This survey provides a focused review of the architectures, design, benchmarks, applications, and ethical considerations surrounding LLM-based scientific agents. We highlight why they differ from general agents and the ways in which they advance research across various scientific fields. By examining their development and challenges, this survey offers a comprehensive roadmap for researchers and practitioners to harness these agents for more efficient, reliable, and ethically sound scientific discovery.