AI Summary
To address the high computational cost and memory footprint of machine-learned interatomic potentials (MLIPs) in large-scale molecular dynamics (MD) simulations, this work introduces knowledge distillation to MLIP training for the first time, proposing a teacher-student collaborative framework. A high-accuracy teacher model provides implicit supervision via atomic energy predictions, guiding the training of a lightweight student model with a customized, resource-efficient architecture. Remarkably, the student achieves superior accuracy to the teacher under identical training data. Experiments on benchmarks including QM9 demonstrate a 12% reduction in mean absolute error (MAE), a 2.3× speedup in MD simulation throughput, and a fivefold reduction in memory consumption relative to the teacher. The core contribution lies in adapting the knowledge distillation paradigm to the MLIP domain, enabling simultaneous optimization of predictive accuracy, computational efficiency, and memory efficiency.
Abstract
Machine learning interatomic potentials (MLIPs) are revolutionizing the field of molecular dynamics (MD) simulations. Recent MLIPs have tended towards more complex architectures trained on larger datasets. The resulting increase in computational and memory costs may prohibit the application of these MLIPs to large-scale MD simulations. Here, we present a teacher-student training framework in which the latent knowledge from the teacher (atomic energies) is used to augment the student's training. We show that the lightweight student MLIPs run MD faster at a fraction of the memory footprint of the teacher models. Remarkably, the student models can even surpass the accuracy of the teachers, although both are trained on the same quantum chemistry dataset. Our work highlights a practical method for MLIPs to reduce the resources required for large-scale MD simulations.
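The distillation idea described above can be sketched as a combined training objective: the usual supervised term on the total energy (the only label in the quantum chemistry dataset) plus an auxiliary term that matches the student's per-atom energy decomposition to the teacher's. This is a minimal illustration, not the paper's actual implementation; the function name, the `lambda_distill` weight, and the plain-list inputs are all assumptions for the sketch.

```python
def distillation_loss(student_atomic_E, teacher_atomic_E, dft_total_E,
                      lambda_distill=0.5):
    """Hedged sketch of a teacher-student MLIP training loss.

    student_atomic_E / teacher_atomic_E: per-atom energy predictions
        (lists of floats, one entry per atom in the structure).
    dft_total_E: reference total energy from the quantum chemistry dataset.
    lambda_distill: illustrative weight on the distillation term.
    """
    # Supervised term: the student's total energy (sum of its atomic
    # energies) against the DFT label that both models are trained on.
    total_E = sum(student_atomic_E)
    energy_term = (total_E - dft_total_E) ** 2

    # Distillation term: per-atom mean squared error against the
    # teacher's latent atomic energies, which have no DFT label of
    # their own and act as the "implicit supervision" in the text.
    n = len(student_atomic_E)
    distill_term = sum((s - t) ** 2
                       for s, t in zip(student_atomic_E, teacher_atomic_E)) / n

    return energy_term + lambda_distill * distill_term
```

In this picture the teacher is only queried during training; at MD time the lightweight student runs alone, which is where the speed and memory savings come from.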