MDCrow: Automating Molecular Dynamics Workflows with Large Language Models

📅 2025-02-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To automate molecular dynamics (MD) simulation workflows for biomolecular systems, this paper introduces MDCrow, an agent built on large language models (LLMs) and evaluated with models including GPT-4o and Llama3-405B. MDCrow applies chain-of-thought reasoning over 40 expert-designed tools that handle file processing, simulation setup, analysis of simulation outputs, and retrieval of information from literature and databases. The agent is evaluated on 25 MD tasks that vary in difficulty and number of required subtasks, and its robustness is tested against both task difficulty and prompt style. GPT-4o completes complex tasks with low variance, followed closely by Llama3-405B, which suggests that an open-weight model can be competitive on domain-intensive scientific agent tasks. Prompt style does not affect the best models but has a significant effect on smaller ones.

📝 Abstract

Molecular dynamics (MD) simulations are essential for understanding biomolecular systems but remain challenging to automate. Recent advances in large language models (LLMs) have demonstrated success in automating complex scientific tasks using LLM-based agents. In this paper, we introduce MDCrow, an agentic LLM assistant capable of automating MD workflows. MDCrow uses chain-of-thought over 40 expert-designed tools for handling and processing files, setting up simulations, analyzing the simulation outputs, and retrieving relevant information from literature and databases. We assess MDCrow's performance across 25 tasks of varying required subtasks and difficulty, and we evaluate the agent's robustness to both difficulty and prompt style. gpt-4o is able to complete complex tasks with low variance, followed closely by llama3-405b, a compelling open-source model. While prompt style does not influence the best models' performance, it has significant effects on smaller models.
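
The idea of an agent stepping through a set of tools, one reasoning step at a time, can be illustrated with a small sketch. This is not MDCrow's implementation: the tool names (clean_structure, setup_simulation, analyze_trajectory), the AgentTrace class, and the run_plan function are all hypothetical, the tools return placeholder strings rather than running any simulation, and the plan is fixed by hand where a real agent would ask an LLM to choose each step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for domain tools; each takes and returns a string.
# None of these names or behaviors come from MDCrow itself.
Tool = Callable[[str], str]


def clean_structure(arg: str) -> str:
    return f"cleaned:{arg}"


def setup_simulation(arg: str) -> str:
    return f"simulation_ready:{arg}"


def analyze_trajectory(arg: str) -> str:
    return f"analysis_of:{arg}"


TOOLS: Dict[str, Tool] = {
    "clean_structure": clean_structure,
    "setup_simulation": setup_simulation,
    "analyze_trajectory": analyze_trajectory,
}


@dataclass
class AgentTrace:
    # Each step records (thought, tool name, observation).
    steps: List[Tuple[str, str, str]] = field(default_factory=list)


def run_plan(plan: List[Tuple[str, str]], initial_input: str) -> AgentTrace:
    """Run (thought, tool name) pairs in order, feeding each tool's
    output to the next tool. Stops at the first unknown tool."""
    trace = AgentTrace()
    current = initial_input
    for thought, tool_name in plan:
        tool = TOOLS.get(tool_name)
        if tool is None:
            trace.steps.append((thought, tool_name, f"error: unknown tool '{tool_name}'"))
            break
        current = tool(current)
        trace.steps.append((thought, tool_name, current))
    return trace


if __name__ == "__main__":
    plan = [
        ("The input file needs cleaning first.", "clean_structure"),
        ("Now the simulation can be set up.", "setup_simulation"),
        ("Finally, analyze the output.", "analyze_trajectory"),
    ]
    for thought, tool_name, observation in run_plan(plan, "protein.pdb").steps:
        print(f"{thought} [{tool_name}] -> {observation}")
```

Recording each thought next to the tool call and its observation keeps the trace inspectable, which is the kind of record an evaluation over many tasks and subtasks would need.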

Problem

Research questions and friction points this paper is trying to address.

Automating molecular dynamics workflows

Handling complex scientific tasks

Evaluating LLM performance across tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM automates MD workflows

Chain-of-thought with expert tools

Robust across tasks and prompts

🔎 Similar Papers

An Autonomous Large Language Model Agent for Chemical Literature Data Mining