Evaluating LLM-generated code for domain-specific languages: molecular dynamics with LAMMPS

📅 2026-03-20
🤖 AI Summary
This work addresses the lack of effective evaluation methods for assessing the scientific validity of domain-specific language (DSL) code—such as LAMMPS molecular dynamics input scripts—generated by large language models (LLMs). To tackle this challenge, the authors propose a lightweight validation framework that combines input file normalization, an extensible DSL parser, and static syntactic and semantic checks. This approach enables domain experts to efficiently verify LLM-generated outputs without requiring deep expertise in the target DSL. By circumventing costly runtime execution, the framework facilitates systematic benchmarking of mainstream LLMs on scientific DSL generation tasks, revealing their current limitations. The study thus provides a practical pathway toward the safe integration of LLMs into specialized scientific computing workflows.
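The normalization step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the `&` line-continuation and `#` comment conventions are standard LAMMPS input syntax, but the `normalize` helper and its exact canonicalization rules are assumptions.

```python
def normalize(script: str) -> list[str]:
    """Canonicalize a LAMMPS input script: join '&' continuations,
    strip '#' comments, collapse whitespace, drop blank lines.
    Illustrative sketch only -- the paper's normalizer may differ."""
    # Join lines ending in '&' (the LAMMPS line-continuation character).
    joined, buffer = [], ""
    for raw in script.splitlines():
        line = raw.rstrip()
        if line.endswith("&"):
            buffer += line[:-1] + " "
            continue
        joined.append(buffer + line)
        buffer = ""
    if buffer:
        joined.append(buffer)

    canonical = []
    for line in joined:
        line = line.split("#", 1)[0]   # strip trailing comments
        line = " ".join(line.split())  # collapse runs of whitespace
        if line:
            canonical.append(line)
    return canonical
```

Canonical forms like these let two superficially different scripts be compared line by line before any syntactic or semantic checks run. For example, `normalize("units   metal  # eV, ps\nrun &\n  1000\n")` yields `["units metal", "run 1000"]`.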

📝 Abstract
Large language models (LLMs) are changing the way researchers interact with code and data in scientific computing. While their ability to generate general-purpose code is well established, their effectiveness in producing scientifically valid code and input scripts for domain-specific languages (DSLs) remains largely unexplored. We propose an evaluation procedure that enables domain experts (who may not be experts in the DSL) to assess the validity of LLM-generated input files for LAMMPS, a widely used molecular dynamics (MD) code, and to use those assessments to evaluate the performance of state-of-the-art LLMs and identify common issues. Key to the evaluation procedure are a normalization step to generate canonical files and an extensible parser for syntax analysis. Subsequent steps isolate common errors without incurring costly tests (in time and computational resources). Once a working input file is generated, LLMs can accelerate verification tests. Our findings highlight limitations of LLMs in generating scientific DSLs and a practical path forward for their integration into domain-specific computational ecosystems by domain experts.
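A static semantic check of the kind the abstract alludes to might look like the ordering check below. The prerequisite table encodes standard LAMMPS requirements (e.g. `pair_coeff` must follow a `pair_style` declaration, and `create_atoms` needs a simulation box, here via `create_box`), but the rule set and the `check_order` function are illustrative assumptions, not the paper's implementation.

```python
# Illustrative 'must appear after' table of standard LAMMPS prerequisites.
# The actual semantic rules used in the paper are not specified here.
REQUIRES = {
    "pair_coeff": "pair_style",    # coefficients need a declared pair style
    "create_box": "region",        # a box is created from a named region
    "create_atoms": "create_box",  # atoms need a box (read_data also works)
}

def check_order(lines: list[str]) -> list[str]:
    """Flag commands whose prerequisite has not yet appeared earlier
    in a normalized (one command per line) script."""
    seen, errors = set(), []
    for n, line in enumerate(lines, 1):
        cmd = line.split()[0]
        need = REQUIRES.get(cmd)
        if need and need not in seen:
            errors.append(f"line {n}: '{cmd}' before required '{need}'")
        seen.add(cmd)
    return errors
```

Because such checks only scan the script, they catch a class of ordering mistakes without launching a simulation, which is the cost-avoidance argument made in the abstract.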
Problem

Research questions and friction points this paper addresses.

large language models
domain-specific languages
molecular dynamics
LAMMPS
scientific code generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

domain-specific languages
large language models
LAMMPS
code validation
molecular dynamics
Ethan Holbrook
School of Materials Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, Indiana, 47907 USA
Juan C. Verduzco
School of Materials Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, Indiana, 47907 USA
Alejandro Strachan
Reilly Professor of Materials Engineering, Purdue University
Predictive simulations of materials, multiscale modeling, theoretical materials science