🤖 AI Summary
Non-expert users face significant barriers to using first-principles simulation codes efficiently for materials modeling. Method: This paper proposes GENIUS, an AI agent framework that integrates a Quantum ESPRESSO knowledge graph, a tiered hierarchy of large language models (LLMs), and a finite-state error-recovery machine to translate natural-language instructions end to end into executable simulation input files, while supporting protocol design, validation, and automatic error repair. Contribution/Results: The framework introduces a knowledge-augmented reasoning architecture and a state-driven hallucination suppression mechanism, enabling, for the first time, autonomous simulation protocol generation and closed-loop error correction. On 295 benchmark tasks, generated input files run to completion in about 80% of cases, 76% are autonomously repaired, and success decays exponentially to a 7% baseline. Inference cost is halved compared with LLM-only baselines, advancing Integrated Computational Materials Engineering (ICME) toward low-barrier, high-reliability deployment.
📝 Abstract
Predictive atomistic simulations have propelled materials discovery, yet routine setup and debugging still demand computer specialists. This know-how gap limits Integrated Computational Materials Engineering (ICME), where state-of-the-art codes exist but remain cumbersome for non-experts. We address this bottleneck with GENIUS, an AI-agentic workflow that fuses a smart Quantum ESPRESSO knowledge graph with a tiered hierarchy of large language models supervised by a finite-state error-recovery machine. Here we show that GENIUS translates free-form human-generated prompts into validated input files that run to completion on $\approx$80% of 295 diverse benchmarks, where 76% are autonomously repaired, with success decaying exponentially to a 7% baseline. Compared with LLM-only baselines, GENIUS halves inference costs and virtually eliminates hallucinations. The framework democratizes electronic-structure DFT simulations by intelligently automating protocol generation, validation, and repair, opening large-scale screening and accelerating ICME design loops across academia and industry worldwide.
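The abstract does not specify how the finite-state error-recovery machine supervises the tiered models, so the following is only a minimal sketch of the general idea, not the GENIUS implementation. It assumes a hypothetical loop with states GENERATE, RUN, and REPAIR, in which a failed run is handed back to the current model tier for repair and the workflow escalates to the next tier once a per-tier repair budget is spent. All names (`State`, `run_workflow`, `cheap_model`, `strong_model`, `toy_runner`) are illustrative, the models are plain stub functions rather than LLM calls, and the runner is a toy string check rather than a real Quantum ESPRESSO run.

```python
from enum import Enum, auto
from typing import Callable, List, Optional, Tuple


class State(Enum):
    GENERATE = auto()
    RUN = auto()
    REPAIR = auto()
    DONE = auto()
    FAILED = auto()


# Hypothetical interfaces (assumptions, not taken from the paper):
# a model maps (prompt, previous input, error message) to a new input file,
# a runner returns None on success or an error message on failure.
Model = Callable[[str, Optional[str], Optional[str]], str]
Runner = Callable[[str], Optional[str]]


def run_workflow(
    prompt: str,
    tiers: List[Model],
    runner: Runner,
    max_repairs_per_tier: int = 2,
) -> Tuple[State, Optional[str], List[State]]:
    """Drive generate/run/repair transitions, escalating tiers on repeated failure."""
    state = State.GENERATE
    tier = 0          # index of the model tier currently in use
    repairs = 0       # repairs attempted so far on the current tier
    input_text: Optional[str] = None
    error: Optional[str] = None
    trace: List[State] = []

    while state not in (State.DONE, State.FAILED):
        trace.append(state)
        if state is State.GENERATE:
            input_text = tiers[tier](prompt, None, None)
            state = State.RUN
        elif state is State.RUN:
            error = runner(input_text)
            state = State.DONE if error is None else State.REPAIR
        elif state is State.REPAIR:
            if repairs >= max_repairs_per_tier:
                # Repair budget spent: escalate to the next tier, or give up.
                tier += 1
                repairs = 0
                if tier >= len(tiers):
                    state = State.FAILED
                    continue
            repairs += 1
            input_text = tiers[tier](prompt, input_text, error)
            state = State.RUN

    trace.append(state)
    return state, input_text, trace


# Toy stand-ins so the sketch can be exercised without any LLM or simulation code.
def cheap_model(prompt: str, prev: Optional[str], err: Optional[str]) -> str:
    return "calculation = 'scf'"  # always omits the cutoff the toy runner wants


def strong_model(prompt: str, prev: Optional[str], err: Optional[str]) -> str:
    return "calculation = 'scf'\necutwfc = 40.0"


def toy_runner(text: str) -> Optional[str]:
    return None if "ecutwfc" in text else "missing ecutwfc"
```

Calling `run_workflow("scf for silicon", [cheap_model, strong_model], toy_runner, max_repairs_per_tier=1)` walks through one failed repair on the first tier before escalating to the second. Keeping the transitions explicit means the models only ever produce text, while the decision of what happens next stays in ordinary control flow.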