MDCrow: Automating Molecular Dynamics Workflows with Large Language Models

📅 2025-02-13
📈 Citations: 0 (influential: 0)
🤖 AI Summary
To address the challenge of automating molecular dynamics (MD) simulation workflows for biomolecular systems, this paper introduces MDCrow, an LLM-based agent evaluated with several models, including GPT-4o and Llama3-405B. MDCrow applies chain-of-thought reasoning over more than 40 expert-designed tools that cover file handling and processing, simulation setup, analysis of simulation outputs, and retrieval of information from literature and databases. It is evaluated on 25 MD tasks that vary in difficulty and in the number of required subtasks, and its robustness to both difficulty and prompt style is measured. GPT-4o completes complex tasks with low variance, and Llama3-405B, an open-source model, follows closely. Prompt style does not affect the best models' performance but has a significant effect on smaller models.

📝 Abstract
Molecular dynamics (MD) simulations are essential for understanding biomolecular systems but remain challenging to automate. Recent advances in large language models (LLMs) have demonstrated success in automating complex scientific tasks using LLM-based agents. In this paper, we introduce MDCrow, an agentic LLM assistant capable of automating MD workflows. MDCrow uses chain-of-thought reasoning over 40 expert-designed tools for handling and processing files, setting up simulations, analyzing the simulation outputs, and retrieving relevant information from literature and databases. We assess MDCrow's performance across 25 tasks that vary in difficulty and in the number of required subtasks, and we evaluate the agent's robustness to both difficulty and prompt style. gpt-4o is able to complete complex tasks with low variance, followed closely by llama3-405b, a compelling open-source model. While prompt style does not influence the best models' performance, it has significant effects on smaller models.
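
The idea of an agent chaining tools across an MD workflow can be illustrated with a minimal sketch. This is not MDCrow's implementation: the tool names (clean_structure, run_simulation, compute_rmsd), the ToolAgent class, and the file name protein.pdb are all hypothetical stand-ins, and the plan is a fixed list here, whereas in an LLM agent the model would choose each next tool from its reasoning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for domain tools; they only tag their input
# so the data flow between steps is visible.
def clean_structure(arg: str) -> str:
    return f"cleaned:{arg}"


def run_simulation(arg: str) -> str:
    return f"trajectory:{arg}"


def compute_rmsd(arg: str) -> str:
    return f"rmsd:{arg}"


@dataclass
class ToolAgent:
    """Assumed, simplified agent: a tool registry plus an execution trace."""

    tools: Dict[str, Callable[[str], str]]
    trace: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, plan: List[str], initial_input: str) -> str:
        """Run the named tools in order, feeding each output to the next."""
        current = initial_input
        for name in plan:
            if name not in self.tools:
                raise KeyError(f"unknown tool: {name}")
            result = self.tools[name](current)
            # Record (tool, input, output) so a run can be inspected later.
            self.trace.append((name, current, result))
            current = result
        return current


agent = ToolAgent(
    tools={
        "clean_structure": clean_structure,
        "run_simulation": run_simulation,
        "compute_rmsd": compute_rmsd,
    }
)
# Fixed plan for illustration; an LLM agent would pick these steps itself.
final = agent.run(["clean_structure", "run_simulation", "compute_rmsd"], "protein.pdb")
print(final)
```

The trace kept by the sketch is one simple way to check afterwards which subtasks of a task were actually carried out.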
Problem

Research questions and friction points this paper is trying to address.

Automating molecular dynamics workflows
Handling complex scientific tasks
Evaluating LLM performance across tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM automates MD workflows
Chain-of-thought with expert tools
Robust across tasks and prompts
Quintina Campbell
Department of Chemical Engineering, University of Rochester, Rochester, New York, USA; FutureHouse Inc., San Francisco, CA
Sam Cox
FutureHouse
computational chemistry, machine learning
Jorge Medina
Department of Chemical Engineering, University of Rochester, Rochester, New York, USA
Brittany Watterson
Department of Biomedical Engineering, University of Rochester, Rochester, New York, USA
Andrew D. White
FutureHouse, University of Rochester
AI Scientist