AI Summary
The scarcity of experimentally resolved crystal structures severely limits the training and validation of machine-learned interatomic potentials. Method: We propose an end-to-end intelligent agent framework that automatically reconstructs simulation-ready atomic structures from STEM images and predicts material properties. Our approach introduces STEM2Mat-Bench, the first benchmark for this task, and a pipeline that integrates pattern-adaptive denoising, physics-guided template retrieval, symmetry-aware atomic reconstruction, MatterSim-based rapid relaxation, and LLM-driven tool orchestration with closed-loop reasoning. Contribution/Results: Remarkably, a text-only large language model outperforms multimodal models on this task. Across 450 structure samples, our method achieves significantly lower lattice RMSD and formation energy MAE, a higher structure-matching success rate, and reduced computational cost compared with state-of-the-art approaches.
Abstract
Machine learning-based interatomic potentials and force fields depend critically on accurate atomic structures, yet such data are scarce because experimentally resolved crystals remain limited. Although atomic-resolution electron microscopy offers a potential source of structural data, converting these images into simulation-ready formats is labor-intensive and error-prone, creating a bottleneck for model training and validation. We introduce AutoMat, an end-to-end, agent-assisted pipeline that automatically transforms scanning transmission electron microscopy (STEM) images into atomic crystal structures and predicts their physical properties. AutoMat combines pattern-adaptive denoising, physics-guided template retrieval, symmetry-aware atomic reconstruction, fast relaxation and property prediction via MatterSim, and coordinated orchestration across all stages. We propose STEM2Mat-Bench, the first dedicated benchmark for this task, and evaluate performance using lattice RMSD, formation energy MAE, and structure-matching success rate. By orchestrating external tool calls, AutoMat enables a text-only LLM to achieve closed-loop reasoning throughout the pipeline and to outperform vision-language models in this domain. In large-scale experiments over 450 structure samples, AutoMat substantially outperforms existing multimodal large language models and tools. These results validate both AutoMat and STEM2Mat-Bench, marking a key step toward bridging microscopy and atomistic simulation in materials science. The code and dataset are publicly available at https://github.com/yyt-2378/AutoMat and https://huggingface.co/datasets/yaotianvector/STEM2Mat.
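The three evaluation metrics named above can be sketched in a few lines. This is a minimal illustration, not the paper's reference implementation: the function names, the input conventions (lattice parameters in angstroms, formation energies in eV/atom), and the 0.1 Å matching tolerance are assumptions made here for the example.

```python
import numpy as np

def lattice_rmsd(pred, ref):
    """Root-mean-square deviation between predicted and reference
    lattice parameters (assumed here to be a, b, c in angstroms)."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def formation_energy_mae(pred, ref):
    """Mean absolute error of predicted formation energies (eV/atom)."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.mean(np.abs(pred - ref)))

def match_success_rate(rmsds, threshold=0.1):
    """Fraction of samples whose lattice RMSD falls below a tolerance.
    The 0.1 Å default is an illustrative choice, not the benchmark's."""
    rmsds = np.asarray(rmsds, dtype=float)
    return float(np.mean(rmsds < threshold))
```

In practice, structure matching is typically done on full crystal structures (e.g., with a symmetry-aware matcher) rather than on lattice parameters alone; the scalar version above only conveys how the benchmark numbers aggregate over the 450 samples.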