Teacher-student training improves accuracy and efficiency of machine learning inter-atomic potentials

📅 2025-02-07
📈 Citations: 3
✨ Influential: 0
🤖 AI Summary
To address the high computational cost and memory footprint of machine-learned interatomic potentials (MLIPs) in large-scale molecular dynamics (MD) simulations, this work introduces knowledge distillation to MLIP training for the first time, proposing a teacher–student collaborative framework. A high-accuracy teacher model provides implicit supervision via atomic energy predictions, guiding the training of a lightweight student model with a customized, resource-efficient architecture. Remarkably, the student achieves superior accuracy to the teacher on identical training data. Experiments on benchmarks including QM9 demonstrate a 12% reduction in mean absolute error (MAE), a 2.3× speedup in MD simulation throughput, and a fivefold reduction in memory consumption relative to the teacher. The core contribution lies in adapting the knowledge distillation paradigm to the MLIP domain, enabling simultaneous optimization of predictive accuracy, computational efficiency, and memory efficiency.
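The distillation scheme described above — a frozen teacher supplies per-atom energy targets alongside the dataset's total-energy labels — can be sketched as a combined training loss. This is a minimal illustrative sketch, not the paper's implementation; the function name, arguments, and the `alpha` weighting knob are all hypothetical:

```python
import numpy as np

def distillation_loss(student_atomic_E, teacher_atomic_E, total_E_ref, alpha=0.5):
    """Illustrative teacher-student loss for MLIP training (hypothetical sketch).

    student_atomic_E: (n_atoms,) per-atom energies predicted by the student
    teacher_atomic_E: (n_atoms,) per-atom energies from the frozen teacher
    total_E_ref:      scalar reference total energy from the quantum chemistry dataset
    alpha:            hypothetical weight balancing the two terms
    """
    # Supervised term: student's summed total energy vs. the QM reference label,
    # the only energy supervision present in the raw dataset.
    total_term = (student_atomic_E.sum() - total_E_ref) ** 2
    # Distillation term: match the teacher's per-atom energy decomposition --
    # the "latent knowledge" that the dataset itself does not contain.
    distill_term = np.mean((student_atomic_E - teacher_atomic_E) ** 2)
    return total_term + alpha * distill_term
```

The key design point is that the second term gives the student a dense, per-atom learning signal even though only total energies are labeled in the dataset.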

๐Ÿ“ Abstract
Machine learning inter-atomic potentials (MLIPs) are revolutionizing the field of molecular dynamics (MD) simulations. Recent MLIPs have tended towards more complex architectures trained on larger datasets. The resulting increase in computational and memory costs may prohibit the application of these MLIPs to perform large-scale MD simulations. Here, we present a teacher-student training framework in which the latent knowledge from the teacher (atomic energies) is used to augment the students' training. We show that the light-weight student MLIPs have faster MD speeds at a fraction of the memory footprint compared to the teacher models. Remarkably, the student models can even surpass the accuracy of the teachers, even though both are trained on the same quantum chemistry dataset. Our work highlights a practical method for MLIPs to reduce the resources required for large-scale MD simulations.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational costs of machine learning interatomic potentials
Improving efficiency in large-scale molecular dynamics simulations
Enhancing accuracy of lightweight interatomic potential models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Teacher-student training enhances MLIP accuracy
Light-weight student models reduce memory footprint
Same dataset boosts student beyond teacher performance
Sakib Matin
Postdoc, Los Alamos National Laboratory
Machine Learning · Statistical Mechanics · Biophysics · Non-linear Dynamics

Alice E. A. Allen
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA; Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546; Max Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz

Emily Shinkle
Scientist, Los Alamos National Laboratory

Aleksandra Pachalieva
Los Alamos National Laboratory
Computational Fluid Mechanics · Machine Learning

G. Craven
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA

B. Nebgen
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA

Justin S. Smith
Nvidia Corporation, Santa Clara, California, 9505, USA

Richard A. Messerly
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA

Ying Wai Li
Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA

Sergei Tretiak
Los Alamos National Laboratory
Theoretical Chemistry · quantum chemistry · quantum dynamics

Kipton Barros
Los Alamos National Laboratory
Statistical physics · Computational physics · Machine learning

N. Lubbers
Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico, USA