Cross-functional transferability in universal machine learning interatomic potentials

📅 2025-04-07
🤖 AI Summary
This work addresses the performance degradation of universal machine-learned interatomic potentials (uMLIPs) when transferring across density functionals (e.g., GGA → r²SCAN), which stems from energy-scale misalignment and weak correlation between low- and high-fidelity data. The authors propose a multi-fidelity transfer learning paradigm that identifies the critical role of elemental energy reference frames in cross-functional transfer, introduces an element-wise energy calibration mechanism, and integrates multi-stage fine-tuning with scaling-law analysis. Evaluated on the MP-r²SCAN dataset (0.24 million structures), the method substantially improves data efficiency at sub-million-structure scales, reducing the amount of high-fidelity data needed to construct accurate potential energy surfaces compared with training from scratch. The approach establishes a generalizable, reproducible pathway for cross-functional knowledge reuse in uMLIPs, enabling robust transfer from widely available GGA-level data to higher-accuracy r²SCAN-level predictions.
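The scaling-law comparison described above (error vs. training-set size, with and without pre-training) can be sketched as a power-law fit in log space. This is a hedged illustration: the dataset sizes and error values below are hypothetical, not the paper's results, and the fitted form err ≈ a·N^(−b) is only the standard ansatz for such analyses.

```python
import numpy as np

# Hypothetical learning curves (meV/atom) at increasing r2SCAN dataset sizes;
# values are illustrative, not taken from the paper.
n = np.array([1e3, 1e4, 1e5, 2.4e5])                 # training-set sizes
err_scratch = np.array([120.0, 60.0, 30.0, 24.0])    # trained from scratch
err_transfer = np.array([40.0, 22.0, 12.0, 10.0])    # pre-trained on GGA

def fit_power_law(n, err):
    """Return (a, b) such that err ~ a * n**(-b), fit by log-log regression."""
    slope, log_a = np.polyfit(np.log(n), np.log(err), 1)
    return np.exp(log_a), -slope

a_s, b_s = fit_power_law(n, err_scratch)
a_t, b_t = fit_power_law(n, err_transfer)
print(f"scratch:  err ~ {a_s:.1f} * N^-{b_s:.2f}")
print(f"transfer: err ~ {a_t:.1f} * N^-{b_t:.2f}")
```

Comparing the two fitted curves shows how pre-training shifts the learning curve downward, which is the sense in which transfer learning reduces high-fidelity data requirements.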

📝 Abstract
The rapid development of universal machine learning interatomic potentials (uMLIPs) has demonstrated the possibility for generalizable learning of the universal potential energy surface. In principle, the accuracy of uMLIPs can be further improved by bridging the model from lower-fidelity datasets to high-fidelity ones. In this work, we analyze the challenge of this transfer learning problem within the CHGNet framework. We show that significant energy scale shifts and poor correlations between GGA and r$^2$SCAN pose challenges to cross-functional data transferability in uMLIPs. By benchmarking different transfer learning approaches on the MP-r$^2$SCAN dataset of 0.24 million structures, we demonstrate the importance of elemental energy referencing in the transfer learning of uMLIPs. By comparing the scaling law with and without the pre-training on a low-fidelity dataset, we show that significant data efficiency can still be achieved through transfer learning, even with a target dataset of sub-million structures. We highlight the importance of proper transfer learning and multi-fidelity learning in creating next-generation uMLIPs on high-fidelity data.
Problem

Research questions and friction points this paper is trying to address.

Challenges in cross-functional data transferability in universal ML interatomic potentials
Energy scale shifts and poor correlation between GGA and r²SCAN energies
Importance of elemental energy referencing in transfer learning
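The first two friction points can be made concrete with a small diagnostic: compute the mean energy offset and the Pearson correlation between the two functionals' energies on the same structures. The per-atom energies below are hypothetical, chosen only to illustrate the calculation.

```python
import numpy as np

# Hypothetical per-atom total energies (eV/atom) for the same five structures
# evaluated with two functionals; values are illustrative, not from the paper.
e_gga = np.array([-6.82, -4.15, -8.90, -3.47, -5.61])
e_r2scan = np.array([-12.31, -9.62, -14.05, -8.90, -10.18])

shift = np.mean(e_r2scan - e_gga)          # systematic energy-scale offset
corr = np.corrcoef(e_gga, e_r2scan)[0, 1]  # how well GGA tracks r2SCAN
print(f"mean shift: {shift:.2f} eV/atom, Pearson r: {corr:.3f}")
```

A large mean shift with imperfect correlation is exactly the regime where naive fine-tuning struggles: the model must relearn the absolute energy scale rather than only the functional-dependent corrections.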
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses CHGNet framework for transfer learning
Implements elemental energy referencing technique
Combines multi-fidelity datasets for efficiency
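Elemental energy referencing can be sketched as a least-squares fit of one reference energy per element, which is then subtracted composition-wise so that both fidelity levels share a common zero. This is an assumption about the general idea, not the paper's exact implementation; the compositions and energies are hypothetical.

```python
import numpy as np

# Composition matrix: rows = structures, columns = element counts
# (e.g. a hypothetical two-element chemistry such as Li and O).
counts = np.array([[2, 1],
                   [1, 2],
                   [4, 2],
                   [0, 2]], dtype=float)
# Hypothetical high-fidelity total energies (eV) for those structures.
energies = np.array([-20.5, -22.3, -41.0, -12.1])

# Per-element reference energies minimizing ||counts @ ref - energies||^2.
ref, *_ = np.linalg.lstsq(counts, energies, rcond=None)
# Energies relative to the fitted elemental references.
calibrated = energies - counts @ ref
print("refs:", ref)
print("calibrated:", calibrated)
```

Applying the same referencing to both the GGA and r²SCAN datasets removes the composition-dependent part of the energy-scale shift, so fine-tuning only has to learn the remaining structure-dependent differences.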
Xu Huang
Department of Materials Science and Engineering, University of California, Berkeley, California 94720, United States; Materials Sciences Division, Lawrence Berkeley National Laboratory, California 94720, United States
Bowen Deng
Postdoc at MIT | PhD at UC Berkeley
Machine Learning · AI for Science · Computational Materials · Energy Materials
Peichen Zhong
National University of Singapore, UC Berkeley, LBNL
Computational modeling · AI for Science · Statistical mechanics · Materials design · Energy storage
Aaron D. Kaplan
Materials Sciences Division, Lawrence Berkeley National Laboratory, California 94720, United States
Kristin A. Persson
Department of Materials Science and Engineering, University of California, Berkeley, California 94720, United States; Materials Sciences Division, Lawrence Berkeley National Laboratory, California 94720, United States
Gerbrand Ceder
Professor of Materials Science and Engineering
Materials design · Computational modeling · Energy storage · Thermoelectrics · Solar