🤖 AI Summary
This work addresses zero-shot out-of-distribution (OOD) prediction in materials and molecular design, that is, reliably predicting high-value property values that lie beyond the range covered by the training data. Conventional inductive models are limited to the range of target values seen during training. To overcome this, we propose a transductive OOD prediction paradigm driven by analogical reasoning, jointly optimizing chemical representation learning, graph neural networks, and OOD classification. Unlike standard approaches, our method makes no assumptions about the label distribution of target properties; instead, it generalizes across distributions by explicitly modeling analogical relationships among samples. Experiments show that, relative to non-transductive baselines, true positive rate (TPR) increases 3× and precision 2× on solid-materials OOD classification, and TPR improves 2.5× and precision 1.5× on molecular property prediction.
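The TPR and precision figures above refer to OOD classification, and a minimal sketch may help make those two metrics concrete. It assumes a sample counts as an OOD positive when its property value exceeds a threshold, taken here to be the maximum value in the training set. That threshold rule, the function name `ood_classification_metrics`, and all numbers below are illustrative assumptions, not details taken from the paper.

```python
def ood_classification_metrics(y_true, y_pred, threshold):
    """Return (TPR, precision) for 'is the property value above threshold?'.

    Assumption: a sample is an OOD positive when its value exceeds
    `threshold` (e.g. the largest property value in the training set).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t > threshold and p > threshold)
    fn = sum(1 for t, p in pairs if t > threshold and p <= threshold)
    fp = sum(1 for t, p in pairs if t <= threshold and p > threshold)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, precision


# Made-up values for illustration only.
train_max = 2.0
y_true = [1.0, 2.5, 3.0, 1.5, 2.2, 1.8]
y_pred = [1.2, 2.4, 1.9, 2.1, 2.6, 1.0]

tpr, precision = ood_classification_metrics(y_true, y_pred, train_max)
print(f"TPR={tpr:.3f}, precision={precision:.3f}")
```

In this toy case three samples are truly above the threshold and the predictor flags three, two of them correctly, so both metrics come out to 2/3. A "3×" improvement in TPR would mean the ratio of this quantity between two models, not a difference in percentage points.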
📝 Abstract
Discovery of high-performance materials and molecules requires identifying extremes with property values that fall outside the known distribution. Therefore, the ability to extrapolate to out-of-distribution (OOD) property values is critical for both solid-state materials and molecular design. Our objective is to train predictor models that extrapolate zero-shot to higher ranges than in the training data, given the chemical compositions of solids or molecular graphs and their property values. We propose a transductive approach to OOD property prediction that improves prediction accuracy. In particular, compared to non-transductive baselines, the True Positive Rate (TPR) of OOD classification improved by 3x for materials and 2.5x for molecules, and precision improved by 2x and 1.5x, respectively. Our method leverages analogical input-target relations in the training and test sets, enabling generalization beyond the training target support, and can be applied to other materials and molecular tasks.
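The idea of using analogical input-target relations to predict beyond the training target support can be illustrated with a deliberately small example. The sketch below is not the authors' model: it assumes a one-dimensional input, a linear ground truth, and a predictor that learns how differences in input map to differences in target, then applies that relation from a known training anchor. The nearest-neighbor baseline and all names (`fit_difference_slope`, `transductive_predict`, `nearest_neighbor_predict`) are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def fit_difference_slope(xs, ys):
    """Least-squares slope w such that (y_j - y_i) ~ w * (x_j - x_i)
    over all ordered pairs of training samples."""
    num = 0.0
    den = 0.0
    for (xi, yi), (xj, yj) in permutations(zip(xs, ys), 2):
        dx, dy = xj - xi, yj - yi
        num += dx * dy
        den += dx * dx
    return num / den


def nearest_index(x_new, xs):
    """Index of the training input closest to x_new."""
    return min(range(len(xs)), key=lambda k: abs(x_new - xs[k]))


def transductive_predict(x_new, xs, ys, w):
    """Predict by analogy: anchor target plus the learned difference relation."""
    i = nearest_index(x_new, xs)
    return ys[i] + w * (x_new - xs[i])


def nearest_neighbor_predict(x_new, xs, ys):
    """Simple baseline that can only return a target seen in training."""
    return ys[nearest_index(x_new, xs)]


# Toy data (assumption): y = 2x, training inputs in [0, 1], targets in [0, 2].
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [2.0 * x for x in train_x]

w = fit_difference_slope(train_x, train_y)

x_query = 1.5  # true target is 3.0, above every training target
y_transductive = transductive_predict(x_query, train_x, train_y, w)
y_baseline = nearest_neighbor_predict(x_query, train_x, train_y)
print(y_transductive, y_baseline)
```

The point of the toy setup is that the query's target lies outside the training range, while the difference between the query and its anchor (0.5) lies inside the range of differences seen among training pairs. The difference-based predictor can therefore return a value above the training maximum, whereas the baseline is capped at the largest training target.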