🤖 AI Summary
Neural network interatomic potentials (NIPs) struggle to directly predict electronic or macroscopic properties, while machine learning models face a trade-off between generalizability and computational cost in structure–property mapping. To address this, we propose HackNIP: a two-stage feature transfer framework. Its core innovation lies in reusing fixed-length, physics-informed structural embeddings extracted from intermediate layers of pretrained NIPs as universal descriptors, followed by lightweight ML models for downstream property prediction. This approach avoids expensive end-to-end training and significantly improves generalization and data efficiency, especially in low-data regimes. Evaluated on the Matbench multitask benchmark, HackNIP outperforms conventional deep learning models across ab initio, experimental, and molecular property prediction tasks. Moreover, its performance consistently improves with deeper embedding representations, demonstrating scalability and robustness.
📝 Abstract
Large-scale foundation models, including neural network interatomic potentials (NIPs) in computational materials science, have demonstrated significant potential. However, despite their success in accelerating atomistic simulations, NIPs face challenges in directly predicting electronic properties and often require coupling to higher-scale models or extensive simulations for macroscopic properties. Machine learning (ML) offers alternatives for structure-to-property mapping but faces trade-offs: feature-based methods often lack generalizability, while deep neural networks require significant data and computational power. To address these trade-offs, we introduce HackNIP, a two-stage pipeline that leverages pretrained NIPs. This method first extracts fixed-length feature vectors (embeddings) from NIP foundation models and then uses these embeddings to train shallow ML models for downstream structure-to-property predictions. This study investigates whether such a hybridization approach, by "hacking" the NIP, can outperform end-to-end deep neural networks, determines the dataset size at which this transfer learning approach surpasses direct fine-tuning of the NIP, and identifies which NIP embedding depths yield the most informative features. HackNIP is benchmarked on Matbench, evaluated for data efficiency, and tested on diverse tasks including ab initio, experimental, and molecular properties. We also analyze how embedding depth impacts performance. This work demonstrates a hybridization strategy to overcome ML trade-offs in materials science, aiming to democratize high-performance predictive modeling.
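The two-stage idea described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: `make_frozen_encoder` is a hypothetical stand-in for a pretrained NIP (fixed random weights instead of learned ones), the structures and the target are synthetic, and ridge regression is just one possible choice of shallow model. It is not the HackNIP implementation; it only shows the shape of the pipeline, in which a frozen encoder produces a fixed-length embedding per structure and a small model is trained on those embeddings.

```python
import math
import random

EMBED_DIM = 8  # assumed width; embeddings from a real NIP would be much wider


def make_frozen_encoder(seed=0, out_dim=EMBED_DIM):
    """Hypothetical stand-in for a pretrained NIP layer (weights are never trained)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(out_dim)]

    def embed(structure):
        """structure: list of (atomic_number, x, y, z) tuples of any length."""
        pooled = [0.0] * out_dim
        for i, (z_i, *pos_i) in enumerate(structure):
            dists = [math.dist(pos_i, pos_j)
                     for j, (_, *pos_j) in enumerate(structure) if j != i]
            mean_d = sum(dists) / len(dists) if dists else 0.0
            close = sum(d < 3.0 for d in dists)  # neighbours within an assumed cutoff
            feats = [z_i / 100.0, mean_d / 10.0, close / 10.0]
            for k, row in enumerate(weights):
                pooled[k] += math.tanh(sum(w * f for w, f in zip(row, feats)))
        # Mean pooling over atoms yields a fixed-length vector for any atom count.
        return [v / len(structure) for v in pooled]

    return embed


def fit_ridge(X, y, lam=1e-3):
    """Stage 2: ridge regression via the normal equations (bias is regularized too)."""
    rows = [list(x) + [1.0] for x in X]
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    for col in range(d):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * d
    for i in reversed(range(d)):  # back substitution
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w


def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def random_structure(rng):
    """Synthetic toy structure with 2 to 6 atoms in a 5 x 5 x 5 box."""
    return [(rng.choice([1, 6, 8, 14]), rng.uniform(0, 5), rng.uniform(0, 5),
             rng.uniform(0, 5)) for _ in range(rng.randint(2, 6))]


def toy_target(structure):
    """Made-up label (mean atomic number), used only to have something to fit."""
    return sum(atom[0] for atom in structure) / len(structure)


if __name__ == "__main__":
    rng = random.Random(1)
    embed = make_frozen_encoder()
    train = [random_structure(rng) for _ in range(200)]
    test = [random_structure(rng) for _ in range(50)]
    # Stage 1: embed once with the frozen encoder. Stage 2: fit the shallow model.
    w = fit_ridge([embed(s) for s in train], [toy_target(s) for s in train])
    mae = sum(abs(predict(w, embed(s)) - toy_target(s)) for s in test) / len(test)
    print(f"toy test MAE: {mae:.3f}")
```

Because the encoder is frozen, embeddings can be computed once and cached, and only the small stage-2 model needs to be refit for each new property. In a real setting the encoder would be replaced by intermediate-layer activations of a pretrained NIP, and the choice of layer is the embedding-depth question the abstract raises.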