Transfer learning from first-principles calculations to experiments with chemistry-informed domain transformation

📅 2025-03-20
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Catalyst performance prediction is hindered by the scarcity of experimental data. Method: This paper proposes a chemistry-informed domain transformation that transfers first-principles computational data into the experimental space. It combines two pieces of prior knowledge, the statistical ensemble and the physical relationship between source (computational) and target (experimental) quantities, to align the heterogeneous domains in an interpretable way. This chemical knowledge guides the domain adaptation and explicitly mitigates the distribution shift between computational and experimental data. Contribution/Results: Using fewer than 10 experimental data points, the method achieves prediction accuracy comparable to that of models trained from scratch on over 100 experimental samples. This can reduce the number of experimental trials required and offers a transfer learning approach for materials problems with few experimental samples.

๐Ÿ“ Abstract
Simulation-to-Real (Sim2Real) transfer learning, a machine learning technique that solves a real-world task efficiently by leveraging knowledge from computational data, has received increasing attention in materials science as a promising solution to the scarcity of experimental data. We propose an efficient transfer learning scheme from first-principles calculations to experiments based on chemistry-informed domain transformation, which integrates the heterogeneous source and target domains by harnessing the underlying physics and chemistry. The proposed method maps the computational data from the simulation space (source domain) into the space of experimental data (target domain). During this process, these qualitatively different domains are integrated using two pieces of prior chemical knowledge: (1) the statistical ensemble, and (2) the relationship between source and target quantities. As a proof of concept, we predict the catalyst activity for the reverse water-gas shift reaction by using abundant first-principles data in addition to the experimental data. In this demonstration, we confirmed that the transfer learning model exhibits positive transfer in both accuracy and data efficiency. In particular, high accuracy was achieved with only a few (fewer than ten) target data points in the domain transformation, with the reported accuracy metric one order of magnitude smaller than that of a full scratch model trained on over 100 target data points. This result indicates that the proposed method achieves high prediction performance with few target data, which helps reduce the number of trials in real laboratories.
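The two ingredients named in the abstract, a statistical ensemble and a source-to-target relationship calibrated on a handful of experimental points, can be illustrated with a toy sketch. This is not the authors' implementation: the Boltzmann average over per-site energies, the linear map to a log-activity, the temperature, and all function names and data below are assumptions chosen only to make the idea concrete, and the data are synthetic.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_average(energies, temperature):
    """Canonical-ensemble average of simulated energies (eV) at a temperature (K)."""
    energies = np.asarray(energies, dtype=float)
    # Shifting by the minimum avoids overflow and leaves the normalized weights unchanged.
    weights = np.exp(-(energies - energies.min()) / (K_B * temperature))
    return float(np.sum(weights * energies) / np.sum(weights))


def fit_domain_map(source_features, target_values):
    """Least-squares fit of target ~ a + b * source from a few paired samples."""
    x = np.asarray(source_features, dtype=float)
    y = np.asarray(target_values, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # array([a, b])


def transform(source_features, coef):
    """Map source-domain features into the (assumed) target space."""
    x = np.asarray(source_features, dtype=float)
    return coef[0] + coef[1] * x


# Synthetic stand-in data (assumption): 50 hypothetical catalysts,
# each with 4 simulated site energies in eV.
rng = np.random.default_rng(0)
temperature = 600.0
site_energies = rng.uniform(0.2, 1.0, size=(50, 4))

# Step 1 (statistical ensemble): one dimensionless feature per catalyst.
source = np.array(
    [boltzmann_average(e, temperature) / (K_B * temperature) for e in site_energies]
)

# Synthetic "experimental" log-activity following an assumed linear relation plus noise.
log_activity = 2.0 - 0.5 * source + rng.normal(0.0, 0.05, size=50)

# Step 2 (source-target relationship): calibrate with only 8 target points.
few = rng.choice(50, size=8, replace=False)
coef = fit_domain_map(source[few], log_activity[few])

# Evaluate the transformed source data against all synthetic targets.
pred = transform(source, coef)
rmse = float(np.sqrt(np.mean((pred - log_activity) ** 2)))
print(f"fitted (a, b) = {coef}, RMSE on all 50 synthetic samples = {rmse:.3f}")
```

Because the map has only two free parameters, a handful of target points is enough to calibrate it in this toy setting; a physically motivated, low-dimensional relationship is what makes such data efficiency plausible.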
Problem

Research questions and friction points this paper is trying to address.

Transfer learning from simulations to experiments in materials science
Integrating computational and experimental data using chemistry principles
Improving catalyst activity prediction with limited experimental data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transfer learning from simulations to experiments
Chemistry-informed domain transformation technique
High accuracy with few experimental data
Yuta Yahagi
NEC Corporation, Minato-ku, Tokyo, Japan, 108-8001; National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan, 305-8568
Kiichi Obuchi
NEC Corporation, Minato-ku, Tokyo, Japan, 108-8001; National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan, 305-8568
Fumihiko Kosaka
National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan, 305-8568
Kota Matsui
Kyoto University
machine learning, statistics, biostatistics