CrystalREPA: Transferring Physical Priors from Universal MLIPs to Crystal Generative Models

📅 2026-05-09

🤖 AI Summary
Existing crystal generative models often lack explicit physical constraints, so the structures they generate can be thermodynamically unstable. This work proposes CrystalREPA, a framework that injects physical priors during training by aligning the generative model's atom-level representations with those of a pretrained machine-learned interatomic potential (MLIP). An element-aware contrastive objective aligns the hidden states of the generative encoder with those of a frozen MLIP; the method is plug-and-play and adds no inference overhead. The study also finds that an MLIP teacher should be chosen for the distinguishability of its atomic representations rather than for conventional accuracy metrics. Evaluated across three generative architectures, ten MLIPs, and two benchmark datasets, CrystalREPA consistently improves the stability, validity, and fidelity of generated crystals.
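As a rough illustration of what an element-aware contrastive alignment could look like, the sketch below computes an InfoNCE-style loss between student (generative encoder) and teacher (frozen MLIP) atom embeddings. The summary does not give the exact objective, so several things here are assumptions: the function name `element_aware_infonce`, the temperature value, the requirement that the student states have already been projected to the teacher's dimension, and the reading of "element-aware" as restricting each atom's negatives to atoms of the same element.

```python
import numpy as np

def element_aware_infonce(student, teacher, elements, temperature=0.1):
    """Hypothetical alignment loss, not the paper's exact objective.

    student:  (N, d) atom-wise hidden states, assumed already projected to d
    teacher:  (N, d) atom-wise features from a frozen MLIP
    elements: (N,) atomic numbers, used to choose the negatives
    """
    # Cosine similarity between every student atom and every teacher atom.
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    logits = s @ t.T / temperature

    # Assumption: negatives are limited to atoms of the same element.
    # The diagonal (an atom paired with its own teacher feature) always survives.
    same_element = elements[:, None] == elements[None, :]
    logits = np.where(same_element, logits, -np.inf)

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive pair for atom i is teacher atom i.
    return float(-np.mean(np.diag(log_prob)))

# Toy usage: four oxygen atoms with orthogonal teacher features.
teacher = np.eye(4)
elements = np.array([8, 8, 8, 8])
loss_aligned = element_aware_infonce(teacher, teacher, elements)
loss_shifted = element_aware_infonce(np.roll(teacher, 1, axis=0), teacher, elements)
```

In training, a term like this would be added to the generative loss with a weighting coefficient, and gradients would flow only into the student. Because the teacher is used only at training time, sampling cost is unchanged, which matches the "no inference overhead" claim.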
📝 Abstract
Crystal generative models mainly learn what stable crystals look like, with little explicit supervision for what makes them stable. We reveal a substantial representation gap between state-of-the-art crystal generative models and pretrained universal machine learning interatomic potentials (MLIPs) via energy probing, and show this gap can be closed by a simple training-time alignment. We propose Crystal REPresentation Alignment (CrystalREPA), a plug-and-play framework that aligns the atom-wise hidden states of generative encoders with frozen MLIP representations through an element-aware contrastive objective, transferring stability-aware atomistic priors with marginal training overhead and no additional inference cost. Across three generative frameworks, ten MLIP teachers, and two benchmark datasets, CrystalREPA consistently improves the thermodynamic stability, structural validity, and structural fidelity of generated crystals. Equally important, we find that an MLIP's transfer effectiveness is poorly predicted by its accuracy on standard leaderboards (e.g., Matbench Discovery) but strongly predicted by the distinguishability of its atom-wise representation space, yielding a practical, accuracy-independent criterion for selecting MLIP teachers for generative transfer.
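The abstract does not spell out the energy-probing protocol. One common form of probing, sketched here under that caveat, fits a small regressor on frozen features and compares held-out error between models. The helper names `fit_ridge_probe` and `probe_mae`, the ridge penalty, and the use of per-structure pooled features are all assumptions made for illustration.

```python
import numpy as np

def _with_bias(features):
    # Append a constant column so the probe can learn an energy offset.
    return np.hstack([features, np.ones((len(features), 1))])

def fit_ridge_probe(features, energies, l2=1e-3):
    """Fit a linear probe from frozen features (n, d) to energies (n,)."""
    X = _with_bias(features)
    gram = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ energies)

def probe_mae(weights, features, energies):
    """Mean absolute error of the probe on held-out structures."""
    return float(np.mean(np.abs(_with_bias(features) @ weights - energies)))
```

Under this reading, the "representation gap" would show up as a noticeably higher held-out `probe_mae` for features taken from the generative encoder than for features taken from the MLIP, with both probes trained on the same structures and energies.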
Problem

Research questions and friction points this paper is trying to address.

crystal generative models
machine learning interatomic potentials
representation gap
thermodynamic stability
structural validity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Crystal generative models
Machine learning interatomic potentials (MLIPs)
Representation alignment
Contrastive learning
Thermodynamic stability
🔎 Similar Papers
Chengqian Zhang
AI for Science Institute, Beijing, China; Center for Data Science, Peking University, Beijing, China
Yucheng Jin
Assistant Professor, Duke Kunshan University
Human-Centered AI, Human-Computer Interaction, Recommender Systems, Digital Wellbeing, Music
Duo Zhang
Twitter, Inc.
Text Mining, Information Retrieval, Data Mining, Machine Learning, Social Networks
Tiejun Li
Center for Data Science, Peking University, Beijing, China; LMAM and School of Mathematical Sciences, Peking University, Beijing, China; Center for Machine Learning Research, Peking University, Beijing, China
Han Wang
Peking University School of Pharmaceutical Sciences
Artificial Intelligence, AI4Science, Molecule Generation, Drug Discovery