🤖 AI Summary
Graph Neural Networks (GNNs) often generalize poorly in molecular property prediction, particularly under out-of-distribution (OOD) conditions. To address this, the authors propose an algorithm-guided pretraining paradigm: GNNs are first trained on execution traces of 24 classical graph algorithms from the CLRS Algorithmic Reasoning Benchmark, and the resulting weights are transferred to a downstream molecular-prediction GNN via layer-wise initialization and parameter freezing, injecting verifiable algorithmic inductive biases. Evaluated on the Open Graph Benchmark, the method improves OOD generalization on ogbg-molhiv (HIV inhibition prediction) and ogbg-molclintox (clinical toxicity prediction) by 6% and 3% absolute, respectively, consistently matching or surpassing randomly initialized baselines across all settings.
📝 Abstract
Neural networks excel at processing unstructured data but often fail to generalise out-of-distribution, whereas classical algorithms guarantee correctness but lack flexibility. We explore whether pretraining Graph Neural Networks (GNNs) on classical algorithms can improve their performance on molecular property prediction tasks from the Open Graph Benchmark: ogbg-molhiv (HIV inhibition) and ogbg-molclintox (clinical toxicity). GNNs trained on 24 classical algorithms from the CLRS Algorithmic Reasoning Benchmark are used to initialise and freeze selected layers of a second GNN for molecular prediction. Compared to a randomly initialised baseline, the pretrained models achieve consistent wins or ties, with Segments Intersect pretraining yielding a 6% absolute gain on ogbg-molhiv and Dijkstra pretraining achieving a 3% gain on ogbg-molclintox. These results demonstrate that embedding classical algorithmic priors into GNNs provides useful inductive biases, boosting performance on complex, real-world graph data.
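The transfer step described above (initialise selected layers from a pretrained model, then freeze them while the rest of the network trains) can be sketched as follows. This is a minimal illustration with hypothetical layer names, not the authors' implementation; models are represented as plain dictionaries of layer weights for clarity.

```python
def transfer_and_freeze(pretrained, target, layers_to_transfer):
    """Copy weights for the chosen layers from `pretrained` into `target`
    and return the set of layer names to exclude from gradient updates."""
    frozen = set()
    for name in layers_to_transfer:
        target[name] = list(pretrained[name])  # copy, don't alias
        frozen.add(name)
    return frozen

# Toy models: layer name -> flat weight list. Names are illustrative only.
pretrained = {"gnn_layer_1": [0.2, -0.5], "gnn_layer_2": [1.1, 0.3], "head": [0.9]}
target = {"gnn_layer_1": [0.0, 0.0], "gnn_layer_2": [0.0, 0.0], "head": [0.0]}

# Transfer the message-passing layers; the task head stays randomly
# initialised and trainable for the downstream molecular task.
frozen = transfer_and_freeze(pretrained, target, ["gnn_layer_1", "gnn_layer_2"])
```

In a real framework the freezing step would instead mark the transferred parameters as non-trainable (e.g. setting `requires_grad = False` in PyTorch) before fine-tuning on the OGB task.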