Autonomous Microscopy Experiments through Large Language Model Agents

📅 2024-12-18
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Existing self-driving laboratories (SDLs) rely on static experimental protocols, limiting their ability to emulate scientists’ adaptive reasoning and intuition in dynamic environments. Method: We propose AILA, the first large language model (LLM)-based autonomous agent system for end-to-end atomic force microscopy (AFM) experimentation—encompassing experimental design, execution, analysis, and closed-loop decision-making. Contribution/Results: We introduce AFMBench, the first benchmark for evaluating LLMs in AFM-driven scientific discovery, uncovering critical deficiencies in multi-agent coordination (73% failure rate), instruction following, and safety alignment, while empirically delineating LLMs’ scientific reasoning boundaries. Leveraging task-decomposition prompting, hardware interface integration, and a multi-agent architecture, AILA achieves autonomous AFM calibration, high-resolution feature identification, and nanomechanical property quantification. Results further reveal substantial accuracy degradation in foundational tasks (e.g., document retrieval), underscoring robustness and trustworthiness as central challenges in AI for Science.

📝 Abstract
The emergence of large language models (LLMs) has accelerated the development of self-driving laboratories (SDLs) for materials research. Despite their transformative potential, current SDL implementations rely on rigid, predefined protocols that limit their adaptability to dynamic experimental scenarios across different labs. A significant challenge persists in measuring how effectively AI agents can replicate the adaptive decision-making and experimental intuition of expert scientists. Here, we introduce AILA (Artificially Intelligent Lab Assistant), a framework that automates atomic force microscopy (AFM) through LLM-driven agents. Using AFM as an experimental testbed, we develop AFMBench, a comprehensive evaluation suite that challenges AI agents based on language models like GPT-4o and GPT-3.5 to perform tasks spanning the scientific workflow, from experimental design to results analysis. Our systematic assessment shows that state-of-the-art language models struggle even with basic tasks such as documentation retrieval, leading to a significant decline in performance in multi-agent coordination scenarios. Further, we observe that LLMs tend not to adhere to instructions, or even divagate to additional tasks beyond the original request, raising serious concerns regarding the safety alignment of AI agents for SDLs. Finally, we demonstrate the application of AILA to increasingly complex, open-ended experiments: automated AFM calibration, high-resolution feature detection, and mechanical property measurement. Our findings emphasize the necessity for stringent benchmarking protocols before deploying AI agents as laboratory assistants across scientific disciplines.
Problem

Research questions and friction points this paper is trying to address.

Current SDLs lack adaptability in dynamic experimental settings
Domain-specific QA proficiency doesn't ensure effective agentic capabilities
LLMs exhibit prompt fragility and safety alignment issues
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven agents automate atomic force microscopy
Multi-agent frameworks outperform single-agent architectures
Comprehensive evaluation suite AFMBench for AI agents
Indrajeet Mandal
School of Interdisciplinary Research, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
Jitendra Soni
Department of Materials Science and Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
Mohd Zaki
Postdoctoral Researcher, Hopkins Extreme Materials Institute, Johns Hopkins University
Civil Engineering, Material Science, Machine Learning
M. Smedskjaer
Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
Katrin Wondraczek
Leibniz Institute of Photonic Technology, 07745 Jena, Germany
Lothar Wondraczek
Professor of Glass Science, University of Jena
amorphous materials, glass science, glass ceramics, glass transition
N. Gosvami
School of Interdisciplinary Research, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India; Department of Materials Science and Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India; Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
N. M. A. Krishnan
School of Interdisciplinary Research, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India; Department of Civil Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India; Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India