Layer-to-Layer Knowledge Mixing in Graph Neural Network for Chemical Property Prediction

📅 2025-10-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Graph neural networks (GNNs) achieve strong performance in molecular property prediction, yet accuracy gains often incur substantial computational and memory overhead. To address this trade-off, the authors propose Layer-to-Layer Knowledge Mixing (LKM), a self-knowledge distillation framework that minimizes the distance between pre-existing hidden embeddings across GNN layers, enabling efficient aggregation of multi-hop and multi-scale structural information at negligible extra training and inference cost. LKM is architecture-agnostic, demonstrated with DimeNet++, MXMNet, and PAMNet. Evaluated on QM9, MD17, and Chignolin, LKM reduces mean absolute error by up to 9.8%, 45.3%, and 22.9%, respectively, improving accuracy without increasing model complexity.

📝 Abstract
Graph Neural Networks (GNNs) are currently the most effective methods for predicting molecular properties, but there remains a need for more accurate models. GNN accuracy can be improved by increasing model complexity, but this also increases the computational cost and memory requirements during training and inference. In this study, we develop Layer-to-Layer Knowledge Mixing (LKM), a novel self-knowledge distillation method that increases the accuracy of state-of-the-art GNNs while adding negligible computational complexity during training and inference. By minimizing the mean absolute distance between pre-existing hidden embeddings of GNN layers, LKM efficiently aggregates multi-hop and multi-scale information, enabling improved representation of both local and global molecular features. We evaluated LKM using three diverse GNN architectures (DimeNet++, MXMNet, and PAMNet) on datasets of quantum chemical properties (QM9, MD17, and Chignolin). We found that LKM effectively reduces the mean absolute error of quantum chemical and biophysical property predictions by up to 9.8% (QM9), 45.3% (MD17 Energy), and 22.9% (Chignolin). This work demonstrates the potential of LKM to significantly improve the accuracy of GNNs for chemical property prediction without any substantial increase in training and inference cost.
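The core idea in the abstract (a self-distillation term that minimizes the mean absolute distance between the hidden embeddings already produced by the GNN's layers) can be sketched as a small auxiliary loss. This is a minimal illustration, not the paper's published code: the pairing scheme (all layer pairs versus adjacent layers only) and any weighting against the main task loss are assumptions here.

```python
import numpy as np

def lkm_loss(layer_embeddings):
    """Sketch of an LKM-style auxiliary loss: mean absolute distance
    between hidden embeddings of different GNN layers.

    layer_embeddings: list of arrays, one per layer, all the same shape
    (e.g. [num_nodes, hidden_dim]). Averages |e_i - e_j| over all
    layer pairs -- an assumed pairing scheme, for illustration only.
    """
    n = len(layer_embeddings)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            # Mean absolute element-wise distance between two layers' embeddings
            total += float(np.mean(np.abs(layer_embeddings[i] - layer_embeddings[j])))
            pairs += 1
    return total / pairs

# In training, this term would be added to the property-prediction loss,
# scaled by a hyperparameter, e.g.: loss = task_loss + lam * lkm_loss(embs)
```

Because the loss reuses embeddings the forward pass already computes, it adds essentially no inference cost, which is the efficiency argument the abstract makes.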
Problem

Research questions and friction points this paper is trying to address.

Improving GNN accuracy for molecular property prediction without computational overhead
Aggregating multi-scale molecular features through layer-to-layer knowledge distillation
Reducing prediction errors in quantum chemical properties with minimal cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-knowledge distillation method for GNN accuracy improvement
Minimizes the mean absolute distance between hidden embeddings across layers
Aggregates multi-hop and multi-scale molecular information
Teng Jiek See
Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, VIC 3068, Australia
Daokun Zhang
University of Nottingham Ningbo China
Graph Learning · Data Mining · Machine Learning
Mario Boley
University of Haifa, Monash University
Interpretable Machine Learning · Materials Informatics · Branch-and-Bound
David K. Chalmers
Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, VIC 3068, Australia