Automating MD simulations for Proteins using Large language Models: NAMD-Agent

📅 2025-07-10

📈 Citations: 0

✨ Influential: 0

career value

204K/year

🤖 AI Summary

Preparing input files for protein molecular dynamics (MD) simulations is time-consuming and error-prone. This paper introduces the first end-to-end solution integrating a large language model (Gemini 2.0 Flash) with web automation, implemented via Python and Selenium to intelligently invoke and iteratively optimize the CHARMM Graphical User Interface (GUI), thereby generating high-fidelity NAMD input files automatically. Our key contributions are: (i) the first LLM-driven, GUI-based interactive parameter generation and self-correcting workflow requiring zero manual intervention; and (ii) an integrated post-processing module enabling parallel configuration of multi-protein systems. Experimental evaluation demonstrates over 80% reduction in simulation setup time, substantial mitigation of human-induced errors, and high robustness and scalability. The proposed framework delivers an efficient, fully automated, and standardized workflow for computational biophysics.

Technology Category

Application Category

📝 Abstract

Molecular dynamics simulations are an essential tool in understanding protein structure, dynamics, and function at the atomic level. However, preparing high quality input files for MD simulations can be a time consuming and error prone process. In this work, we introduce an automated pipeline that leverages Large Language Models (LLMs), specifically Gemini 2.0 Flash, in conjunction with python scripting and Selenium based web automation to streamline the generation of MD input files. The pipeline exploits CHARMM GUI's comprehensive web-based interface for preparing simulation-ready inputs for NAMD. By integrating Gemini's code generation and iterative refinement capabilities, simulation scripts are automatically written, executed, and revised to navigate CHARMM GUI, extract appropriate parameters, and produce the required NAMD input files. Post processing is performed using additional software to further refine the simulation outputs, thereby enabling a complete and largely hands free workflow. Our results demonstrate that this approach reduces setup time, minimizes manual errors, and offers a scalable solution for handling multiple protein systems in parallel. This automated framework paves the way for broader application of LLMs in computational structural biology, offering a robust and adaptable platform for future developments in simulation automation.

Problem

Research questions and friction points this paper is trying to address.

Automating MD simulation input file preparation for proteins

Reducing time and errors in protein simulation setup

Integrating LLMs to streamline NAMD input generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages LLMs for automating MD input file generation

Integrates Gemini with Python and Selenium web automation

Uses CHARMM GUI for simulation-ready NAMD inputs

🔎 Similar Papers

AlphaFolding: 4D Diffusion for Dynamic Protein Structure Prediction with Reference and Motion Guidance